Improved multiplicative signal correction method and apparatus.



An improved method and apparatus are disclosed for processing
spectral data to remove undesired variations in such data and to
remove interfering information present in the data. The method
and apparatus correct multiplicative effects present in the
spectral data. Additive and interferent contributions can be
corrected as well. In one aspect of the method, coefficients for a
selected appropriate model are applied to the input spectral data
based on first and second reference spectra. The spectral data are
then corrected based on the estimated coefficients at least as to
multiplicative errors for producing a linear additive structure
for use in calibration, validation and determination by linear
multivariate analysis. The method and apparatus will improve the
accuracy of spectral data structures derived from measurements
using spectroscopy, chromatography, thermal analysis, mechanical
vibration and acoustic analysis, rheology, electrophoresis, image
analysis and other analytical technologies producing data of
similar multivariate nature.
IMPROVED MULTIPLICATIVE SIGNAL CORRECTION METHOD AND APPARATUS



Field of the Invention 

The present invention relates generally to pro- 
cessing of spectral data to reduce undesired vari- 
ations and to remove interfering information 
present in the data. Specifically, the present inven- 
tion relates to an improved instrument or method 
for processing of spectral data to reduce undesired
variations and to remove interfering information 
present in the data. Most specifically, the present 
invention relates to an improved instrument, meth- 
od or process to provide improved measurements 
of analytes based on spectral data by reducing
undesired variations and removing interfering in- 
formation present in the data. 



Background of the Invention 

Spectral data consists of multiple interrelated 
data points, such as an optical spectrum or a 
chromatogram, which carries information related to 
the components and characteristics of the speci- 
men from which the data was derived, as well as to 
the performance of the analytical instrument and to 
the general experimental conditions. In spectro- 
scopy, for example, this specimen is a material and 
the spectral data comprises the results of related 
measurements made on the specimen as a function
of a variable, such as the frequency or
wavelength of the energy used for measurement. In 
chromatography, the spectral variable may be time 
or distance. In thermal analysis, the variable is 
usually temperature or time. In mechanical 
vibration/acoustics analysis the variable is usually 
frequency. In rheology the variable can be position, 
shear rate or time. In electrophoresis and thin layer 
chromatography the variable is relative distance in 
one or two dimensions. In many different analyses, 
e.g. kinetic measurements, time is either the pri- 
mary variable or an additional variable that adds to 
the dimensionality of the data. 

In image analysis the fundamental variable is usually distance in
one or two dimensions, although the two-dimensional Fourier
transform, also known as the Wiener transform, and the Wiener
spectrum, which express the information in a two-dimensional
spatial frequency domain, are also prevalent. Multivariate images,
such as three color video signals and many satellite images where
each picture element is characterized by a multichannel
"spectrum", and also images constituting a time sequence of
information, provide additional dimensionality in the data.
Alternatively, in image analysis, the images can be summarized
into histograms showing distributions of various picture elements,
where the variable is then a vector of categories, each
representing a class of picture elements, e.g. various gray levels
of pixels, or contextual classes based on local image geometry.
For multivariate images, the additional multichannel information
may be included in the contextual classification. Time information
can likewise be included in the definition of the categories in
the variable. The above descriptions of two-way images also apply
to three-way tomographic images, e.g. in MRI and X-ray tomography.

It should also be noted that it is possible to express spectral
data more or less equivalently in several different domains, i.e.
with respect to several different variables, for example by a
Fourier, Wiener, or Hadamard transformation, and using different
metrics, such as Euclidean and Mahalanobis distances.

For all of the above types of spectral data, information from
several specimens may be related to each other statistically to
derive analytical information. In order to derive specific desired
analytical information, such as the concentration of a
constituent, the magnitude of a physical property, or the
identification of the specimen or its components, one form or
another of additive multivariate approximation or modeling is
typically employed. For example, a desired parameter may be
modeled as the suitably weighted sum of the measurements at
selected data points within the spectrum or a weighted sum of
previously determined reference spectra. The weighting
coefficients, sometimes referred to as the calibration
coefficients, are statistically determined based on spectral data
obtained from a set of calibration specimens for which the values
for the parameter of interest are known.

This additive multivariate calibration may be considered as a
general interference subtraction, whereby each input spectrum is
resolved as a sum of underlying structures, each with a known or
estimated spectrum. The structures can be known or directly
measured spectra of various individual phenomena affecting the
input spectrum, or estimated "loading" vectors (e.g. bilinear
factors) that span their variability statistically. The resolution
yields estimates of the level or score of each such phenomenon or
factor in the input spectrum. Then the additive modeling performs
the equivalent of a weighted subtraction of the various
interferants' spectral effects, thereby providing selectivity
enhancement.
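
By way of illustration only, the general interference subtraction
described above can be sketched for the simplest case of two known
component spectra. The spectra, the function name `resolve_two`,
and the restriction to two components with no offset term are
assumptions made for this example and are not taken from the
disclosure.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def resolve_two(x, r1, r2):
    """Least-squares scores (s1, s2) such that x ~ s1*r1 + s2*r2.

    Solves the 2x2 normal equations by Cramer's rule. The determinant
    is zero when r1 and r2 are collinear, in which case the two
    components cannot be separated.
    """
    a11, a12, a22 = dot(r1, r1), dot(r1, r2), dot(r2, r2)
    b1, b2 = dot(r1, x), dot(r2, x)
    det = a11 * a22 - a12 * a12
    s1 = (b1 * a22 - a12 * b2) / det
    s2 = (a11 * b2 - a12 * b1) / det
    return s1, s2


# Hypothetical reference spectra with overlapping bands.
analyte = [0.0, 0.2, 1.0, 0.2, 0.0, 0.0]
interferent = [0.0, 0.0, 0.3, 1.0, 0.3, 0.0]

# Input spectrum built as a purely additive mixture.
x = [0.8 * a + 0.5 * j for a, j in zip(analyte, interferent)]

s_analyte, s_interferent = resolve_two(x, analyte, interferent)

# Weighted subtraction of the interferent's estimated contribution.
cleaned = [xk - s_interferent * j for xk, j in zip(x, interferent)]
```

In this noise-free additive case the scores recover the mixing
levels and the subtraction leaves only the analyte contribution.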

Additive multivariate approximation is appro- 
priate for purely additive structures, or by taking 
the logarithm of the data values, for purely multiplicative
structures. The modeling is much less
accurate and robust for mixed additive and mul- 
tiplicative structures. Unfortunately, real measured 
spectral data usually has some degree of such 
mixed structures including multiplicative effects 
that affect the analytical sensitivity. In diffuse re- 
flectance spectroscopy, for example, the scatter 
coefficient varies due to particle size. Even when 
grinding is appropriate, the resulting particles have 
a range of sizes, with a mean and distribution that 
is variable depending on both physical and chemi- 
cal factors, and a range of optical properties that 
vary with the wavelength itself as well as the par- 
ticle composition and physical shape. In transmis- 
sion spectroscopy, the effective optical pathlength 
may be affected by changes in geometry, scatter- 
ing, temperature, density of the material, and re- 
lated physical parameters. Variation in the amount 
of material added to the column produces mul- 
tiplicative effects in chromatography as does the 
intensity of the dye added to gel in electrophoresis. 
In image analysis, variations in the total area of pixels
counted and the pixel intensity can contribute mul- 
tiplicative factors to otherwise additive structures. 
Finally, instrumental and other experimental effects, 
e.g. a nonlinear instrument response, may appear 
as multiplicative factors, particularly when a loga- 
rithmic data transformation is applied. 

Much of the effort to overcome these effects 
has resulted from the increased use of near-in- 
frared diffuse reflectance spectroscopy, in which 
multiplicative effects are quite large although not 
necessarily obvious on first examination of spectral 
data. Near-infrared spectra tend to decrease in 
absorbance with decreasing wavelength because 
the absorption bands are based on several orders 
of overtones and combinations of mid-infrared vi- 
brational frequencies. Band strength decreases as 
the order of the harmonic involved increases, i.e. 
as the frequency increases. A multiplicative effect 
on such a tilted spectrum appears similar to the 
addition of a tilted baseline. Therefore, Norris and 
other early workers used the first or second deriva- 
tive of the absorbance spectrum with respect to 
wavelength in their models. The derivatives explic- 
itly remove any additive constant and, in the case 
of the second derivative, any linear sloped additive 
baseline. Unfortunately, the true multiplicative ef- 
fects remain after taking the derivative of the data. 
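
The behavior of the derivative transformation described above can
be checked numerically. The following is a minimal sketch using a
finite-difference second derivative at unit spacing; the spectrum
values, the baseline parameters, and the factor of 1.5 are
hypothetical.

```python
def second_difference(y):
    """Finite-difference second derivative for unit point spacing."""
    return [y[i - 1] - 2 * y[i] + y[i + 1] for i in range(1, len(y) - 1)]


# Hypothetical spectrum: a single band on a flat background.
true_spectrum = [0.0, 0.1, 0.4, 1.0, 0.4, 0.1, 0.0]

# The same spectrum with an additive linear baseline (offset 0.5, slope 0.2).
with_baseline = [v + 0.5 + 0.2 * k for k, v in enumerate(true_spectrum)]

# The same spectrum with a multiplicative factor of 1.5.
scaled = [1.5 * v for v in true_spectrum]

d2_true = second_difference(true_spectrum)
d2_baseline = second_difference(with_baseline)  # baseline is removed
d2_scaled = second_difference(scaled)           # factor of 1.5 survives
```

The second difference of the baseline-shifted spectrum matches that
of the original, while the second difference of the scaled spectrum
is still 1.5 times larger, so the multiplicative effect is carried
through the derivative unchanged.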

Removal of a multiplicative factor implies division of the data by
an appropriate value. Norris (K.H. Norris and P.C. Williams,
Optimization of Mathematical Treatments of Raw Near-Infrared
Signal in the Measurement of Protein in Hard Red Spring Wheat. I.
Influence of Particle Size, Cereal Chem. 61(2):158 and K.H.
Norris, Extracting Information from Spectrophotometric Curves
Predicting Chemical Composition from Visible and Near-Infrared
Spectra, Food Research and Data Analysis, H. Martens and H.
Russwurm, Ed. Applied Science Publishers, Ltd. 1983 Essex,
England, copies of which are annexed hereto and incorporated in
their entirety by reference) introduced the use of derivative
ratios by 1981. In their approach, first or second derivative
spectra are used so that any baseline offsets are eliminated. The
absence of baseline offset in the divisor is a requirement to
maintain linearity when removing a multiplicative factor. Their
method involves selecting a wavelength for the first numerator by
examination of the correlation of the data at each wavelength with
the values of the parameter of interest. A denominator wavelength
is then selected by similar examination of the correlation of the
ratio to the parameter of interest. Iteration involving changes to
the data point spacing and smoothing used in the finite difference
computation of the derivative is then performed to optimize the
approximation. Additional terms may then be added to the model in
a stepwise procedure. This method has been useful; however, it is
limited to a specific calibration using data at a few selected
wavelengths.

Murray and Jessiman (I. Murray and C.S. Jessiman, unpublished work
(1982) quoted in Animal Feed Evaluation by Near Infrared
Reflectance (NIR) Spectrocomputer, paper presented at the Royal
Society of Chemistry Symposium at the University of East Anglia,
Norwich UK, 23 March 1982, a copy of which is annexed hereto and
incorporated in its entirety by reference) developed a technique
termed "mathematical ball milling" which provided a correction to
the whole spectrum. In their technique, simple linear least
squares regression (estimation of a multiplicative slope and
additive offset parameter) is used to find the best linear fit of
each spectrum, as well as of the average of many spectra,
(ordinates or regressands) to a vector representing the actual
wavelength, e.g., nanometers (common abscissa or regressor). Each
individual spectrum is then modified with respect to offset and
slope such that the simple linear regression line of the modified
spectrum is coincident with the regression line initially obtained
for the average spectrum.
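
One possible reading of the "mathematical ball milling" procedure
is sketched below for illustration. The text above does not state
exactly how each spectrum is modified; the sketch assumes an affine
rescaling of the spectral values (a gain and an offset per
spectrum), and the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Simple linear least squares: return (offset, slope) of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def ball_mill(wavelengths, spectra):
    """Rescale each spectrum so its regression line on wavelength
    coincides with the regression line of the average spectrum.

    Assumes every spectrum has a non-zero slope against wavelength.
    """
    count = len(spectra)
    mean = [sum(s[k] for s in spectra) / count for k in range(len(wavelengths))]
    a_m, b_m = fit_line(wavelengths, mean)
    corrected = []
    for s in spectra:
        a_i, b_i = fit_line(wavelengths, s)
        gain = b_m / b_i            # matches the slopes
        shift = a_m - gain * a_i    # matches the offsets
        corrected.append([shift + gain * v for v in s])
    return corrected, (a_m, b_m)


# Hypothetical tilted spectra on a common wavelength axis (nanometers).
wavelengths = [1100.0 + 200.0 * k for k in range(8)]
base = [0.10, 0.15, 0.30, 0.28, 0.45, 0.50, 0.70, 0.72]
spectra = [base,
           [0.05 + 0.8 * v for v in base],
           [0.20 + 1.3 * v for v in base]]

corrected, target = ball_mill(wavelengths, spectra)
```

After the correction, refitting any corrected spectrum against
wavelength returns the offset and slope of the average spectrum's
regression line.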

Martens, Jensen and Geladi (H. Martens, S.A. Jensen, and P.
Geladi, Multivariate Linearity Transformations for Near-Infrared
Reflectance Spectrometry, Proceedings, Nordic Symposium on Applied
Statistics, Stavanger, June 1983, Stokkand Forlag Publishers,
Stavanger, Norway pp. 205-234, a copy of which is annexed hereto
and incorporated in its entirety by reference) developed the
method of "Multiplicative Scatter Correction" that is the
forerunner of the present invention. They utilize a previously
known reference spectrum representative of the "ideal specimen".
In practice this is usually based on the average of the spectra
contained in the calibration data set. Each spectrum, whether used
for calibration, validation, or determination, is then projected
on this average spectrum by simple linear regression over selected
wavelengths and its offset and slope relative to the average
spectrum thereby determined. Corrected spectra are then obtained
by subtracting the appropriate offset coefficient from each
spectrum and then dividing the resulting spectral data by the
slope coefficient. The estimated slope coefficient is sometimes
modified somewhat at different wavelengths in order to correct for
wavelength dependency of the scatter coefficient. The resulting
corrected spectral values equal the average spectral values plus
residuals that contain the desired analytical information
normalized to the average measurement conditions. This method,
however, is subject to errors caused by the non-random nature and
potentially large magnitude of these residuals.
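
The conventional multiplicative scatter correction just described
can be sketched as follows. This is an illustrative example only:
the function names and the spectra are hypothetical, all
wavelengths are used in the regression, and no wavelength-dependent
modification of the slope is attempted.

```python
def fit_line(x, y):
    """Simple linear least squares: return (offset, slope) of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def msc(spectrum, reference):
    """Regress the spectrum on the reference spectrum, then subtract
    the offset and divide by the slope."""
    offset, slope = fit_line(reference, spectrum)
    corrected = [(v - offset) / slope for v in spectrum]
    return corrected, offset, slope


# Hypothetical average ("ideal specimen") spectrum.
reference = [0.2, 0.5, 0.9, 0.6, 0.4, 0.3]

# A specimen that differs from the reference only by offset and scale.
spectrum = [0.3 + 2.0 * r for r in reference]

corrected, offset, slope = msc(spectrum, reference)
```

When the specimen differs from the reference only by an offset and
a scale factor, the corrected spectrum coincides with the
reference. When the specimen also carries analyte or interferent
information that correlates with the reference, that information
ends up in the regression residuals and biases the estimated offset
and slope, which is the error source noted above.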

A prior approach to minimizing these errors 
has been to omit those portions of the spectrum 
having large variability from the data used in the 
regression. This approach is sometimes difficult to 
apply, because it may require many trials and 
operator judgments, and it is only partially success- 
ful at best. In a variation of this approach, the range 
of the spectral data included in the average spec- 
trum used to determine the offset and slope coeffi- 
cients is restricted to the vicinity of a strong iso- 
lated spectral feature, such as a solvent absorption 
band, thereby limiting the magnitude of the residu- 
als and improving the accuracy of the correction. 
This variation has been applied to correction of the 
effects of scattering within the specimen in trans- 
mission spectroscopy. In many cases, however, 
there is no strong isolated band available for deter- 
mination of the multiplicative correction. A related 
problem occurs in measuring one material through 
another with the pathlength through each material 
unknown and variable. In either case, better means 
are needed to accurately separate additive and 
multiplicative effects. 

Varying levels of known or unknown additive 
interferences also characterize the above forms of 
spectral data. In spectroscopy it is common to
have a background spectrum added to the desired 
data from sources such as absorption by the sol- 
vent used to dissolve the specimen for analysis, 
absorption by the reference used to determine the 
incident energy, nonspecific emission or fluores- 
cence from the specimen or instrumentation, and 
stray light, specular reflections, and other measure- 
ment artifacts. The other technologies discussed 
above have similar problems of additive interfer- 
ences. 

Specimen stability is often a cause of such a problem. For
example, in near-infrared diffuse reflectance spectroscopy
powdered specimens are common. The water content of many powdered
specimens tends to equilibrate with the environmental humidity. In
many cases, it is extremely difficult to maintain an adequate set
of calibration and validation specimens with a sufficient range of
water content to allow accurate calibration. Temperature also
affects the spectra, particularly in the case of hydrogen bonded
species such as water. A small fraction of one degree Celsius
temperature change can be readily detected in aqueous specimens.
Adequate control of specimen temperature during measurement is
difficult in the laboratory and often impossible in a processing
plant environment. Other measurement technologies are subject to
such difficult-to-control variables. A method to accurately remove
the spectral effects of such variables without disturbing the
analyte information prior to use of the spectra for calibration,
validation, and determination would improve the utility and
performance of multivariate data analysis techniques.

Manual subtraction of one or more background spectra from an
unknown spectrum by graphically oriented trial-and-error is well
known in several disciplines, e.g. in UV, VIS, and IR
spectroscopy. This type of interference subtraction has the
advantage of letting the user interactively apply his or her
knowledge of the structures involved. Automated methods have been
developed but these are subject to significant errors,
particularly where not all the constituent spectra are known,
where constituent spectra are influenced by changes in the
environment, and where the background or interference spectra are
correlated with the analyte spectra.

In general, the above previous spectral correction techniques have
been based on assumptions that the data structures are linear in
the parameters. Various linearization techniques are applied to
the data, most commonly the logarithmic transformation to convert
purely multiplicative structures to additive form and, in diffuse
reflectance spectroscopy, the Kubelka-Munk function. While useful,
these data transformations are based on the assumption that the
structure is intrinsically linear. Physical and instrumental
effects often add intrinsically nonlinear elements to measured
data structures, even if the underlying phenomena are linear.


Objects of the Invention 



Accordingly, it is an object of the present invention to improve
the accuracy of multivariate analysis of spectral data structures
derived from measurements using spectroscopy, chromatography,
thermal analysis, mechanical vibration and acoustic analysis,
rheology, electrophoresis, image analysis, and other analytical
technologies producing data of similar multivariate nature.

It is an object of the present invention to more 
accurately correct spectral data to reduce or eliminate
multiplicative effects thereby improving and
simplifying subsequent additive modeling. 

To accomplish this object, it is a further object 
of this invention to distinguish additive features, 
which in spectroscopy may be chemical or phys- 
ical, from multiplicative features, which in spec- 
troscopy are generally physical, thereby reducing 
the danger of confusing and destroying the desired 
information in the multiplicative signal correction 
process. 

It is a further object of this invention to provide 
error warnings if the additive (e.g. chemical) fea- 
tures are too similar to the multiplicative (e.g. phys- 
ical) features to allow reliable multiplicative signal 
correction. 

It is yet another object of this invention to allow 
for non-linear effects such as wavelength dependencies
and wavelength shifts, by going from non-iterative
linear to iterative non-linear modeling in
the multiplicative signal correction. 

It is an additional object of this invention to 
provide a multivariate interference rejection filter 
which removes the spectral information due to vari- 
able interferences without disturbing the desired 
analyte information. 

It is still another object of this invention to 
integrate the multiplicative signal correction with 
the additive calibration regression or determination 
of unknown values. 

It is yet another object of this invention that 
these additive and multiplicative correction and in- 
terference rejection filter methods provide graphi- 
cally based interactive as well as fully automated 
operation so as to allow the users to use their 
judgement and experience in applying the methods 
if they so desire. 



Summary of the Invention 

In accordance with the present invention, a method is used wherein
spectral data of each specimen is represented by a multivariate
model using previously known spectral information, as opposed to
only the average or "ideal specimen" spectral data utilized with
simple linear regression modeling in the prior multiplicative
scatter correction method. In the first instance, where the
multiplicative corrections are of prime concern, the method
encompasses incorporating the reference spectra of selected other
components and determining the coefficients using multivariate
estimation means rather than simple linear regression. This
results in substantially more accurate estimation of the magnitude
of the offset and multiplicative corrections, due to the reduction
of the amount of unmodeled information contained in the residuals.

The present invention includes using any linear multivariate
estimator to determine the correction coefficients such as
multiple linear regression, generalized least squares, maximum
likelihood regression, robust regression, estimated best linear
predictor, partial least squares, principal component regression,
Fourier regression, covariance adjustment, or non-Euclidean
distance measures.

The present invention also includes using any non-linear
multivariate estimation method to determine the corrections, such
as linearization by Taylor expansion, the steepest descent method,
Marquardt's compromise, or simplex optimization to define
coefficients minimizing the sum squared error of the nonlinear
model.

In addition to subtracting the offset coefficient resulting from
the multivariate modeling, the present invention also comprises,
as option A, using the coefficients of the interfering components
derived by the linear or non-linear modeling to scale the spectra
of these components, so that subtraction of the scaled spectra
from the data can substantially remove their contribution from the
data.
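
For illustration, option A can be sketched as a single correction
step applied once the coefficients are available. The sketch
assumes the coefficients have already been estimated by some
multivariate estimator; the function name `correct_option_a`, the
spectra, and the coefficient values are hypothetical.

```python
def correct_option_a(x, offset, slope, interferents, coeffs):
    """Subtract the offset and each scaled interferent spectrum from x,
    then divide by the multiplicative coefficient."""
    corrected = []
    for k, xk in enumerate(x):
        additive = offset + sum(t * r[k] for t, r in zip(coeffs, interferents))
        corrected.append((xk - additive) / slope)
    return corrected


# Hypothetical standard spectrum and one interferent (e.g. water).
standard = [0.2, 0.5, 0.9, 0.6, 0.4, 0.3]
water = [0.0, 0.0, 0.1, 0.7, 1.0, 0.4]

# Input spectrum: offset 0.1, multiplicative factor 1.2, water level 0.4.
x = [0.1 + 1.2 * s + 0.4 * w for s, w in zip(standard, water)]

# Coefficients assumed to have been estimated beforehand.
y = correct_option_a(x, offset=0.1, slope=1.2,
                     interferents=[water], coeffs=[0.4])
```

With exact coefficients the corrected spectrum reduces to the
standard spectrum; in practice the quality of the result depends on
how well the coefficients were estimated.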

The present invention also comprises, as option B, generating
modified reference spectra of the interfering components that
contain only those portions of the original reference spectra that
are orthogonal to, and therefore uncorrelated with, one or more
reference analyte spectra. The coefficients generated for these
orthogonal spectra are not influenced by the presence or magnitude
of analyte information contained in the raw data even if the
analyte spectrum is not included in the modeling or the
coefficient estimator does not inherently orthogonalize the
components, so they may be a more correct representation of the
magnitude of the spectral effects of the interfering components.
These more accurate coefficients are then used to scale the
original reference spectra prior to subtraction from the input
data and the correction proceeds as in option A. This option
reduces or eliminates the error in analyte spectral contribution
that would otherwise be caused by subtracting an incorrect amount
of a spectrum which contains some information equivalent to
analyte information.
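
The effect of option B can be illustrated with a single interferent
and a single analyte. The sketch below is an assumption-laden
example: the spectra are hypothetical, orthogonality is taken with
respect to the plain inner product (no mean centering), and the
coefficient is estimated by a one-regressor fit without an offset
term.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthogonalize(interferent, analyte):
    """Part of the interferent spectrum orthogonal to the analyte spectrum."""
    alpha = dot(interferent, analyte) / dot(analyte, analyte)
    return [j - alpha * a for j, a in zip(interferent, analyte)]


def coefficient(x, r):
    """One-regressor least-squares coefficient of x on r (no offset)."""
    return dot(x, r) / dot(r, r)


# Hypothetical overlapping reference spectra.
analyte = [0.0, 0.2, 1.0, 0.2, 0.0, 0.0]
interferent = [0.0, 0.0, 0.3, 1.0, 0.3, 0.0]
interferent_perp = orthogonalize(interferent, analyte)

# Two specimens with the same interferent level (0.7) but very
# different analyte levels.
x_low = [0.7 * j + 0.1 * a for j, a in zip(interferent, analyte)]
x_high = [0.7 * j + 2.0 * a for j, a in zip(interferent, analyte)]

# Regressing on the original interferent spectrum alone is biased
# by the analyte level.
naive_low = coefficient(x_low, interferent)
naive_high = coefficient(x_high, interferent)

# Regressing on the orthogonalized spectrum is not.
t_low = coefficient(x_low, interferent_perp)
t_high = coefficient(x_high, interferent_perp)

# The coefficient then scales the ORIGINAL reference spectrum
# before subtraction, as in option A.
cleaned_high = [xk - t_high * j for xk, j in zip(x_high, interferent)]
```

Both orthogonalized estimates equal the true interferent level, and
the subtraction leaves the analyte contribution intact, whereas the
naive estimates differ between the two specimens.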

The present invention also comprises, as option C, the further
scaling of the spectra of the interfering components and, if
desired, the spectra of the analyte(s) so as to control the degree
of spectral modification and correction applied to the data.
Spectral components may be removed, down-weighted, left as is, or
emphasized by control of the weighting coefficients.
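
One way to picture the weighting of option C is sketched below. The
weight convention is an assumption made for this example only: a
weight of 1.0 leaves a component as is, 0.0 removes it, values in
between down-weight it, and values above 1.0 emphasize it. The
function name and data are hypothetical.

```python
def apply_weights(x, spectra, coeffs, weights):
    """Rescale each component's estimated contribution t*r in x so
    that it appears with weight w instead of weight 1."""
    y = list(x)
    for r, t, w in zip(spectra, coeffs, weights):
        for k in range(len(y)):
            y[k] += (w - 1.0) * t * r[k]
    return y


# Hypothetical standard spectrum and one interferent.
standard = [0.2, 0.5, 0.9, 0.6, 0.4, 0.3]
water = [0.0, 0.0, 0.1, 0.7, 1.0, 0.4]
x = [s + 0.4 * w for s, w in zip(standard, water)]

removed = apply_weights(x, [water], [0.4], [0.0])     # water removed
unchanged = apply_weights(x, [water], [0.4], [1.0])   # left as is
halved = apply_weights(x, [water], [0.4], [0.5])      # down-weighted
```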
The present invention also comprises, as option D, updating of the
analyte and interference spectra based on the results of later
stages of data processing and analysis, for example based on
principal components analysis (PCA) or partial least squares
(PLS). This is particularly useful in conjunction with "Signal
Processing Method and Apparatus" US Application 07/319,450 filed
3/3/89 by Edward Stark, one of the co-inventors of the present
invention.

The present invention also comprises, as option E, interactively
displaying graphical output concerning which analyte and
interference spectra, if any, are causing difficulties with
respect to estimation of the multiplicative correction, and
interactive control over which reference spectra are utilized, the
spectral range included in estimating the coefficients, and the
weighting of the additive corrections employed.

For a better understanding of the present invention, reference is
made to the following description and accompanying drawings while
the scope of the invention will be pointed out in the appended
claims.



Brief Description of the Drawings 



Figure 1 is a perspective view of a spectrophotometry sensor
system for use in the performance of the methods and apparatus of
the present invention;

Figure 2 is a general block diagram of the apparatuses used as
components in conjunction with the present invention as shown in
Figure 1;

Figure 3 is a block diagram of the data normalizer unit as used in
the present invention;

Figure 4 is a block diagram of a non-linear model coefficient
estimator used in conjunction with the present invention;

Figure 5 is a block diagram of a principal component regression
device used in conjunction with the present invention.

Description of the Preferred Embodiments 



The present invention has general applicability 
in the field of signal and data processing, wherever 
"spectra" or data structures consisting of multiple 
interrelated data points are obtained and the vari- 
ability in the data can be described as combined 
additive and multiplicative effects. These types of 
effects are common in many forms of measure- 
ment and previous efforts have been made to solve 
the problems, as discussed above. 

Like Norris, the present application uses division to normalize
the multiplicative variability but it employs all or most of the
spectral information rather than one or a few selected wavelengths
and does not depend on use of the derivative data transformation.

Like Murray and Jessiman's "mathematical ball 
milling" it seeks to normalize every input spectrum 
to some reference or "average" state by additive 
and multiplicative normalization, and it allows an 
explicit correction for wavelength effects (most sim- 
ply by including wavelength as an extra additive 
"constituent" vector but more effectively through 
the use of non-linear modeling). In addition, it em- 
ploys a different regressor (an actual reference 
spectrum, e.g. an average spectrum) for determin- 
ing the multiplicative correction and it allows mod- 
eling and, if desired, subtraction of several addi- 
tional phenomena at the same time. 

Like the conventional Martens, Jensen, and 
Geladi multiplicative scatter correction (MSC), it 
seeks to normalize every input spectrum to some
reference "ideal" or "average" state by additive 
and multiplicative normalization, after having es- 
timated the offset and slope parameters by some 
type of regression against some reference spec- 
trum over some selected wavelength range, and 
this reference spectrum may be of the same kinds 
as those employed in MSC. However, it extends 
conventional MSC by explicitly modeling the ef- 
fects of anticipated additive interferences and by 
optionally utilizing nonlinear modeling in deriving 
the additive and multiplicative normalization. This in 
turn improves the accuracy of the multiplicative 
correction, it allows removal of undesired inter- 
ferants already at the multiplicative preprocessing 
stage, and it simplifies a causal understanding of 
the multiplicative correction and facilitates its inter- 
active graphical optimization. It may also create 
interference reference spectra orthogonal to the 
analyte(s) spectra for use in the modeling to avoid 
the effects of intercorrelation between the inter- 
ferant spectra and the analyte spectra which would 
otherwise cause inaccuracies in estimating the in- 
terferant coefficients and in the subsequent sub- 
traction of their contribution to the spectral data 
being normalized. 

Like the manual background subtraction, it also 
allows graphical interactive access, but in addition 
it employs statistical parameter estimation in the 
determination of how much to subtract. Like the 
general interference subtraction, it also allows mod- 
eling and subtraction of several phenomena at the 
same time but it combines additive and multiplica- 
tive modeling into one process and compensates 
for intercorrelation among the spectral components. 

If the physical situation results in a substan- 
tially linear combined additive and multiplicative 
structure, the measured spectral information may 
be considered as: 

$$X_{ki} = t_{oi} + R_{ks} t_{si} + R_{ka} t_{ai} + R_{kj} t_{ji} + Q_{km} t_{mi} + e_{ki}$$
$X$ is the spectral ordinate, e.g. absorbance, fluorescent energy,
or pixel intensity or relative count, representing the measurement
system response. The subscript $i$ denotes the object or specimen
while subscript $k$ ($k = 1, 2, \ldots, K$) is the spectral
variable. $k$ may be representative of a single dimension, e.g.
wavelength in optical spectroscopy, two dimensions, e.g. time and
wavelength in GC-IR measurements, or more depending on the
measurement technology utilized. As used here, names of matrices
are capitalized (e.g. $R_{kj}$) and a matrix may comprise a single
row or a single column of elements, for example $X_{ki}$,
$Y_{ki}$, and $R_{ks}$ are individual spectra represented by
single column matrices (vectors) of length $K$. Sets of multiple
spectra form matrices (e.g. $R_{ka}$, $R_{kj}$, and $Q_{km}$).
Quantities that only exist as vectors or scalars are not
capitalized.

The physical situation described above is linear in the
parameters, i.e. the spectral data consists of the sum of spectral
components $R_{kn}$ and $Q_{km}$ that are functions of the
variable $k$, each contributing to the spectrum of specimen $i$ in
an amount defined by the coefficients $t_{ni}$ or $t_{mi}$, the
values of which differ from specimen to specimen but are not
functions of $k$. $e_{ki}$ is additive random error in the
spectral data. The spectral components may be considered the
fundamental signatures of the underlying chemical or physical
parameters being measured while the coefficients relate to the
quantity of the parameter and the sensitivity of the measurement.
Many fundamental physical processes generate such linear spectral
data, e.g. the absorbance spectra of chemical mixtures measured by
optical spectroscopy. $t_{oi}$ describes an additive offset and
any additive baseline that is a function of $k$ can be considered
an additional spectral component $Q_{km}$. Variations in the
additive offset and the sensitivity of the measurement among data
from different specimens contribute the additive and
multiplicative errors for which this invention provides improved
data corrections.
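
To make the roles of the offset and the sensitivity concrete, the
linear structure can be simulated for two measurements of the same
composition. This is an illustrative sketch only: the unmodeled
term $Q_{km} t_{mi}$ and the error $e_{ki}$ are set to zero, the
sensitivity is assumed to scale every coefficient by a common
factor, and all names and values are hypothetical.

```python
def simulate(t_o, t_s, r_s, analytes, t_a, interferents, t_j):
    """Build one spectrum as offset + standard + analyte + interferent
    contributions, with no unmodeled components and no noise."""
    x = []
    for k in range(len(r_s)):
        value = t_o + t_s * r_s[k]
        value += sum(t * r[k] for t, r in zip(t_a, analytes))
        value += sum(t * r[k] for t, r in zip(t_j, interferents))
        x.append(value)
    return x


# Hypothetical reference spectra.
r_s = [0.2, 0.5, 0.9, 0.6, 0.4, 0.3]          # standard spectrum
analytes = [[0.0, 0.2, 1.0, 0.2, 0.0, 0.0]]   # one analyte
interferents = [[0.0, 0.0, 0.1, 0.7, 1.0, 0.4]]

# Measurement 1: no offset, unit sensitivity.
x1 = simulate(0.0, 1.0, r_s, analytes, [0.5], interferents, [0.2])

# Measurement 2: same composition, offset 0.2, sensitivity 1.3.
b = 1.3
x2 = simulate(0.2, b * 1.0, r_s, analytes, [b * 0.5], interferents, [b * 0.2])

# Removing the offset and dividing by the sensitivity restores x1.
restored = [(v - 0.2) / b for v in x2]
```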

The fundamental improved method of data normalization provided by
this invention is based on the use of previously obtained
reference spectra $R_{ks}$, $R_{ka}$, and $R_{kj}$ to model the
input spectral data $X_{ki}$. Therefore, they are separately
considered in the equation above. $Q_{km}$ describes spectral
information present in the input data that is not represented by
any of the reference spectra. The objective is to include
sufficient spectral information in $R_{ks}$, $R_{ka}$ and $R_{kj}$
so that $Q_{km} t_{mi}$ is small enough that it may be safely
neglected.

R ks is the primary "standard" spectrum used 
as the basis for determining the multiplicative cor- 
rection coefficient. Typically, it represents the aver- 
age spectrum of the class of specimens, the spec- 
trum of the solvent within which the analyte is 
dissolved and to which the analyte concentration is referenced
(e.g. molality), or the spectrum of a naturally occurring or
artificially introduced tracer material. On the basis that the
offset t oi is an artifact that should be removed, and that it
is desired to normalize the coefficients of each spectrum so
that the standard component R ks always has the same
contribution in the data, a corrected spectrum Y ki can be
defined as

$$Y_{ki} = (X_{ki} - t_{oi}) / t_{si}$$

In order to perform such a correction, the values of t oi and
t si must be derived from the data X ki separately for each
specimen i. Improved methods and apparatus for estimating these
values are the subject of this invention.
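
As an illustration of the correction defined above, the following is a minimal Python sketch. It assumes the simplest possible estimator, an ordinary least squares regression of X ki on the single standard spectrum R ks (the conventional multiplicative scatter correction), not the improved estimators that are the subject of this invention. The function name `basic_correction` and the sample values are illustrative assumptions.

```python
def basic_correction(x_ki, r_ks):
    """Estimate the offset and scale of x_ki relative to r_ks, then remove them.

    Assumption: t_oi and t_si are estimated by simple least squares
    regression of the input spectrum on the standard spectrum.
    """
    k = len(x_ki)
    mean_x = sum(x_ki) / k
    mean_r = sum(r_ks) / k
    s_rr = sum((r - mean_r) ** 2 for r in r_ks)
    s_rx = sum((r - mean_r) * (x - mean_x) for r, x in zip(r_ks, x_ki))
    b_si = s_rx / s_rr               # estimate of the multiplicative factor t_si
    b_oi = mean_x - b_si * mean_r    # estimate of the additive offset t_oi
    y_ki = [(x - b_oi) / b_si for x in x_ki]
    return y_ki, b_oi, b_si


# Hypothetical standard spectrum and an input spectrum with offset 0.5, scale 2.0.
r_ks = [0.10, 0.40, 0.90, 0.30, 0.20]
x_ki = [0.5 + 2.0 * r for r in r_ks]
y_ki, b_oi, b_si = basic_correction(x_ki, r_ks)
```

In this noise-free example the corrected spectrum reproduces the standard spectrum, because the input contains no analyte or interferent contribution.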

R ka are one or more reference spectra representing the
expected influence of the analytes of interest on the input
data. In this context analyte is used broadly to indicate the
quality sought in the subsequent analysis of the data, for
example a quantity of a constituent or an identification of the
specimen or one or more of its components based on a
between-class discriminant function. R kj are reference spectra
representing the expected influence of various undesired
interferences, whether chemical or physical interferences in
the specimen or artifacts introduced by the instrumentation, on
the input data X ki .

If R ks is the ideal spectrum, e.g. the average taken under the
same measurement conditions, the expected value of t si is 1.
If R ks is a pure solvent taken under the same conditions, t si
is expected to be less than 1, depending on solute
concentration. If R ks is taken under different pathlength or
concentration conditions, t si can be less than or greater than
1. t ai and t ji are related to the concentration of the
components and differences in the measurement sensitivity
between the data for R kn and that for X ki .

The reference spectra represent previously known, more or less
accurate information about how the qualities sought in the
subsequent analysis (e.g. analyte concentrations or
between-class discriminant functions) and various interferences
are expected to affect the input data. These reference spectra
can be based on direct physical measurements of individual
specimens, direct physical measurement of the separate
constituents, or statistical summaries or estimates of the
spectra based on sets of specimens. For example, R ks may be
the average of all the spectra obtained by measurement of the
calibration set of specimens or the result of a careful
measurement of a solvent blank. It is desirable that the
spectral characteristics of R ks be stable. If R ks is the
average spectrum, this implies use of a reasonably large number
of representative individual spectra in computing the average
spectrum. In the case of a solvent spectrum R ks , it is often
desirable to characterize the solvent by more than one spectrum
to encompass possible variations due to, for example, the
influence of specimen composition, temperature or other
environmental factors. The most stable spectral component is
then used as R ks and the spectra representing deviations or
variations in the solvent spectrum are included in R kj .

The R ka and R kj reference spectra are often 
statistical estimates extracted from the measured 
data from sets of specimens, although directly 
measured spectra are also useful in many cases. 
Honigs' spectral reconstruction (D.E. Honigs, G.M. Hieftje,
and T. Hirschfeld, A New Method for Obtaining Individual
Component Spectra from Those of Complex Mixtures, Applied
Spectroscopy, 38(3), pp. 317-322, a copy of which is annexed
hereto and incorporated in its entirety by reference)
provides a method for extracting spectra from a set 
of mixture specimens based on knowledge of the 
concentration values. Principal component analysis 
(PCA) and partial least squares (PLS) provide or- 
thogonal sets of spectra representative of the vari- 
ation in the data. Stark's method (US Patent ap- 
plication 07/319,450) provides reference spectra for 
previously unknown variations based on analysis of 
replicate data. In the simplest operation of the present
invention, i.e. correction for offset and multiplicative
factors, the primary requirement of R ka and R kj is that they
reasonably span the variation of X ki so as to stabilize the
modeling; spectral accuracy and specificity are not essential.
For the more complex options, in which R ka and R kj are
incorporated into the output data as corrections, the quality
of R ka and R kj becomes more important. The accuracy and
specificity of the R ka spectral data is particularly important
in orthogonalization of R kj , either explicitly or implicitly,
and when used for added weight as described below. The spectral
information in R ka and R kj may be represented in
various ways with respect to redundancy and col- 
linearity, for example one individual vector for each 
phenomenon, several replicates or specimens, sta- 
tistical summaries (averages, bilinear components, 
square root of covariance matrices, etc.), or rotated 
representations where some or all collinearities 
have been eliminated. In a preferred embodiment, 
redundancy is eliminated by averaging so that the 
number of vectors equals the number of phenom- 
ena being modeled. 

An intrinsically nonlinear data structure may 
arise because the physics of the measurement 
and/or instrumentation has introduced characteris- 
tics differing from those described above. One 
common type of nonlinearity is analytical sensitivity 
(e.g. gain, pathlength) which is a function of the 
variables. A more general description of the input 
data then takes the form 

$$X_{ki} = C_{ki} + D_{ki}\,[t_{oi} + R_{ks} t_{si} + R_{ka} t_{ai} + R_{kj} t_{ji}] + e_{ki} = C_{ki} + D_{ki}\,[R_{kn} t_{ni}] + e_{ki}$$

This describes a structure resulting from nonlinear distortions
of the fundamentally linear structure described by
[R kn *t ni ] where

$$R_{kn} = [1, R_{ks}, R_{ka}, R_{kj}]$$

and any other spectral data, for example the Q km *t mi , are
included in the error e ki . It will be appreciated by those
skilled in the art that other forms of nonlinearity may arise
which can also be described as distortions of the basic linear
structure above.

In the present case, C ki and D ki may each be a function of
the spectral values X ki , of the wavelength k, or of both. In
the usual nonlinear case of small but significant deviations
from linearity, the values of C ki and D ki will be close to 0
and 1 respectively. If the nonlinearity is negligible, C ki can
be set equal to 0 and D ki equal to 1. The resulting structure
is then equivalent to the linear additive and multiplicative
structure discussed above.

The forms of C ki and D ki are related to the causes of
nonlinear behavior. For example, in spectroscopy the amount of
scattering and therefore the effective pathlength may vary as a
smooth function of wavelength. Refractive index effects have
similar smoothly varying forms with respect to wavelength.
Therefore, D ki may be a smooth but non-linear function of
wavelength. On the other hand, the effective pathlength may be
a smooth function of absorbance, as in convergence error in
transmission spectroscopy, where an increase in absorbance
reduces the energy from longer pathlengths more than from
shorter ones, thereby making the effective pathlength grow
shorter as absorbance increases. Measurements in transflection
mode, where convergence error is maximized, resulted in use of
a model that gave excellent correction

$$X_{corrected} = b_0 + X b_1 + X^3 b_3$$

where X is measured log(1/R). In general both k and X variables
should be included in the nonlinear model, for example in
transflection of scattering samples.

For generality, the scattering pathlength and similar
multiplicative effects can be described as a function of k in
accordance with the series expansion

$$D_{ki} = d_{0i} + d_{1i} k + d_{2i} k^2 + d_{3i} k^3 + \ldots$$

Again for generality, convergence error and other effects that
affect the linearity of the value of X can be described by

$$X_{ki,\mathrm{real}} = c_{0i} + c_{1i} X_{ki} + c_{2i} X_{ki}^2 + c_{3i} X_{ki}^3 + \ldots$$

Therefore, a general form for C ki and D ki can be described as
a product of these series, i.e. a new series in terms of the
powers of k and X and their cross products, removing redundant
constants and terms in k or X and normalizing so that the
linear magnitude information is kept in [R kn *t ni ] and C ki
and D ki carry only the information relating to the
nonlinearity. C ki and D ki are matrices containing K rows, and
as many columns as required for the number of terms in the
appropriate power series approximations.

It is the underlying intrinsically linear additive structure
that is desired for later analysis steps using linear
multivariate calibration, validation and determination
procedures. Therefore, the corrected spectra Y ki are formed by

$$Y_{ki} = \{(X_{ki} - C_{ki})/D_{ki} - b_{oi}\}/b_{si} = [R_{ks} + R_{ka} b_{ai} + R_{kj} b_{ji}] + e_{ki}$$

where the b ni are estimates of the true t ni , and C ki , D ki ,
and b ni are derived from the data X ki .

The corrected spectrum Y ki comprises the standard spectrum
R ks and linear additive deviations from R ks caused by
analytes, interferants, and errors, and it is therefore
suitable for further linear data analysis.

Preferred embodiments of the above are illus- 
trated in the figures and further described below. In 
Figure 1 for instance a photospectrophotometric 
sensor system 100 as used in the present invention 
is described. This system can be used, for exam- 
ple, in the determination of analytes in blood or, for 
instance, in the display of glucose levels in blood. 
This sensor system 100 comprises an optical 
source 110, for example a General Electric type 
EPT tungsten halogen projection lamp 111 mounted
in a housing 112 containing a fan 113 for
cooling and coupled to a 1 cm. diameter fiber optic 
bundle 120 for transmitting energy to the speci- 
men, e.g. the surface of the skin of a patient. 
Energy transmitted through the tissue is collected 
by a second fiber optic bundle 130, which trans- 
mits it to the spectrophotometer 140. This spec- 
trophotometer comprises an entrance slit 141, a 
concave holographic grating 142, and one or more 
diode array detectors 143 and their associated 
order sorting filters 144, arranged to measure en- 
ergy at different points in the spectral image 
formed by the holographic grating, and therefore at 
different wavelengths within the visible and near- 
infrared regions of the electromagnetic spectrum. 
Each detector channel has an associated pream- 
plifier 145, the output of which is multiplexed by 
multiplexer 146 into a programmable gain and off- 
set amplifier 147. The spectrophotometer 140 is 
further described in "Improved Grating Spectrom- 
eter", a US patent application filed August 24, 1989 
by Edward Stark, one of the coinventors of the 
present invention. The contents of said application 
are incorporated herein by reference. 

As is shown in Figure 2, the time sequential 
multiplexed analog signal is then converted to digi- 
tal form by an analog to digital converter 201 in 
data acquisition system 200. In preprocessor 202, the energy
data is processed to eliminate instrumental offsets and to
reduce both systematic and random noise and then ratioed to
obtain data relating to transmission of the specimen. This data
is then linearized with respect to the analyte information of
interest, e.g., the logarithm of transmission is more or less
linear with concentration of chemical constituents within the
specimen. The data is then in form to be normalized in
accordance with the methods of this invention. Although the
details may differ, similar functions are utilized in obtaining
spectral data of the other forms discussed above.
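
The preprocessing chain described above (offset removal, ratioing, logarithmic linearization) can be sketched in Python as follows. This is a hedged illustration only: the text does not state which measurement the specimen energy is ratioed against or which logarithm base is used, so the separate `reference_energy` and `dark_energy` readings, the base-10 logarithm, and the function name `preprocess` are all assumptions.

```python
import math


def preprocess(sample_energy, reference_energy, dark_energy):
    """Convert raw detector energies to absorbance-like values.

    Assumptions: dark_energy is the instrumental offset per channel,
    reference_energy is the energy measured without the specimen, and
    linearization uses -log10 of the transmission ratio.
    """
    absorbance = []
    for s, r, d in zip(sample_energy, reference_energy, dark_energy):
        transmission = (s - d) / (r - d)   # offset removal, then ratio
        absorbance.append(-math.log10(transmission))
    return absorbance


# Hypothetical three-channel reading.
sample = [110.0, 510.0, 1010.0]
reference = [1010.0, 1010.0, 1010.0]
dark = [10.0, 10.0, 10.0]
absorbance = preprocess(sample, reference, dark)
```

Noise reduction, which the preprocessor also performs, is omitted from this sketch.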

The additive and multiplicative corrections of this invention
are performed by data normalizer 300, which comprises special
purpose digital computation logic. After normalization, the
data may be further processed in processor 400 prior to use for
multivariate calibration, validation and determination of
unknown values. For example, processor 400 may comprise the
invention of "Signal Processing Method and Apparatus", US
Patent Application 07/319,450 filed March 3, 1989. The contents
of said application are incorporated herein by reference.
Finally, the data is analyzed in the data analyzer 500, which
performs the functions of multivariate calibration, validation,
and determination required to generate the analytical values
then presented on display 600.

The data normalizer 300 of the present system illustrated in
Figure 3 provides a number of options for processing the input
spectral data X ki . The basic improved method of data
normalization provided by this invention is based on the use of
reference spectra R ks , R ka , and R kj stored in the reference
spectra storage 310 to model the input spectral data X ki by
means of the coefficient estimator 320, and to determine
corrected spectral data Y ki by means of calculator 330.
Functions 340, 350, and 360 provide additional options which
are bypassed for the basic corrections. The control and logic
sequencer 370 provides the timing, data selection, and control
signals required to perform the selected functions in proper
sequence.

In a preferred embodiment shown in Figure 3, this correction is
implemented by the calculator function 330 comprising
subtractor 331, divider 332, subtractor 333, and divider 334
that perform successive operations on X ki involving the
coefficients C ki , D ki , b oi , and b si generated by the
coefficient estimator 320. These operations are performed
sequentially element by element by indexing k with a first
counter and performing the required sequence of digital
arithmetic functions under logic control based on the state of
a second counter. These digital arithmetic functions are
available digital logic functions utilized in computers, and
may conveniently be obtained in the 80287 math coprocessor
device or an array processor.

In a preferred addition to the basic preferred embodiment
described above and shown in Figure 3, additional correction
spectra [R ka *b ai ] and [R kj *b ji ] are formed by matrix
multiplier 340 from the reference spectra and their associated
coefficients that were also generated in coefficient estimator
320. A matrix multiplier to form the spectrum [R kn *b ni ] for
a single input spectrum i consists of short term storage for
both inputs, a multiplication and summation circuit, an address
sequencer which accesses the corresponding elements n of R kn
and b ni , and a second address sequencer which accesses the
rows k of R kn and addresses the short term storage which keeps
the resulting K x 1 matrix. Matrix multiplication is also a
standard function of available array processors. This
additional combined correction spectrum may be used directly by
subtractor 333 to further correct Y ki , which becomes

$$Y_{ki} = \{(X_{ki} - C_{ki})/D_{ki} - b_{oi}\}/b_{si} - [R_{kj} b_{ji}] = [R_{ks} + R_{ka} b_{ai}] + e_{ki}$$

Here the [R ka *b ai ] represents the analyte(s) of interest,
which must not be removed. Ideally, for a single analyte,
[Y ki - R ks ] is simply the analyte spectrum whose scale factor
represents the amount of analyte present.

Often, only a subset of the possible [R kj *b ji ] corrections
are applied at this stage of data processing, because later
additive linear modeling may be more effective than the
spectral subtraction based on previously known reference
spectra performed here. The [R kj *b ji ] should include,
however, those interferants that are difficult or impossible to
adequately represent in the calibration data, e.g. moisture and
temperature as previously discussed.

Greater control of the situation is provided in a preferred
embodiment that also incorporates multiplier 351 and component
weight storage 352. In this embodiment, the amount or weight
h a or h j of each correction spectrum that is subtracted in
forming Y ki can be controlled by the operator or in accordance
with information obtained in later data processing steps. The
corrections then become h a [R ka *b ai ] and h j [R kj *b ji ].
The corrected spectrum then becomes

$$Y_{ki} = \{(X_{ki} - C_{ki})/D_{ki} - b_{oi}\}/b_{si} - h_j [R_{kj} b_{ji}] - h_a [R_{ka} b_{ai}] = R_{ks} + (1 - h_j)[R_{kj} b_{ji}] + (1 - h_a)[R_{ka} b_{ai}] + e_{ki}$$

A weight of h j = 1 is used when complete cancellation is
desired, while a weight of h j = 0 provides no correction for
interferant j and h a = 0 preserves the analyte signal
unchanged. Values 0 < h j < 1 may be used to downweight
information that has uncertainty or potentially harmful effects
on later data analysis without total rejection. A weight
h a < 0 increases the weight of the analyte information,
thereby reducing the relative importance of other information
in the corrected signal.
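
The effect of the weights h a and h j can be illustrated with a short Python sketch. It assumes a single analyte and a single interferant, that the offset and multiplicative corrections have already been applied, and that the coefficients b ai and b ji are already known; the function name `weighted_correction` and all numeric values are hypothetical.

```python
def weighted_correction(y_basic, r_ka, b_ai, r_kj, b_ji, h_a=0.0, h_j=1.0):
    """Subtract weighted analyte and interferant correction spectra.

    y_basic is the spectrum after offset and multiplicative correction,
    i.e. {(X - C)/D - b_oi}/b_si in the notation of the text.
    """
    return [
        y - h_j * rj * b_ji - h_a * ra * b_ai
        for y, ra, rj in zip(y_basic, r_ka, r_kj)
    ]


# Hypothetical standard, analyte and interferant spectra with known coefficients.
r_ks = [0.10, 0.40, 0.90, 0.30]
r_ka = [0.00, 1.00, 0.00, 0.50]
r_kj = [1.00, 0.50, 0.20, 0.00]
b_ai, b_ji = 0.2, 0.3
y_basic = [s + a * b_ai + j * b_ji for s, a, j in zip(r_ks, r_ka, r_kj)]

# h_j = 1, h_a = 0: interferant fully cancelled, analyte untouched.
full = weighted_correction(y_basic, r_ka, b_ai, r_kj, b_ji)
# 0 < h_j < 1: interferant downweighted rather than removed.
half = weighted_correction(y_basic, r_ka, b_ai, r_kj, b_ji, h_j=0.5)
# h_a < 0: analyte contribution enhanced (here doubled).
boosted = weighted_correction(y_basic, r_ka, b_ai, r_kj, b_ji, h_a=-1.0)
```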

The orthogonal component generator 360 provides for
transformation of the reference spectra [R ks , R ka , R kj ]
into a new set of spectra, [P ks , P ka , P kj ], some or all of
which are orthogonal to each other. If the reference spectra
are latent variables derived from a single PCA or PLS analysis,
they are orthogonal by definition. If they are measured spectra
of components or otherwise separately derived, they will
generally be intercorrelated, which if severe enough may cause
errors in the coefficient values or failure of the coefficient
estimator to complete its operation. If orthogonal reference
spectra are created, new reference spectra may be added without
requiring complete recalculation by the coefficient estimator.
Orthogonal reference spectra also minimize the number of
operations required by the coefficient estimator to determine
the coefficients.

In a preferred embodiment, the orthogonal component generator
performs a Gram-Schmidt orthogonalization in accordance with

$$Z_{iT} = (I - Z (Z'Z)^{-1} Z')\,Z_i = Z_i - Z (Z'Z)^{-1} Z' Z_i$$

where

I = the identity matrix,
Z = the matrix of vectors already transformed,
Z i = the column vector of X to be transformed,
Z iT = the transformed vector orthogonal to the vectors already
in Z,
Z' = the transpose of Z, and [ ]^-1 = the inverse of [ ].

Z comprises orthogonal columns, therefore [Z'Z] is diagonal of
size (i)x(i) and determining [Z'Z]^-1 is trivial by inversion
of the individual elements.

The first reference spectrum to be orthogonalized is R ks
(i = 2, Z'Z = K from the column of 1's) whereby

$$P_{ks} = R_{ks} - \Big(\sum_k R_{ks}\Big)/K = R_{ks} - \mathrm{average}(R_{ks})$$

The variations of R ks are preserved in P ks , therefore the
coefficient b s is not affected by the orthogonalization. Each
succeeding R kn is then orthogonalized against the matrix
formed by the preceding orthogonal P kn spectra, until all
reference spectra are orthogonalized into matrix P kn =
[1, P ks , P ka , P kj ]. Each spectrum P kn comprises the
residuals of the regression of R kn on the preceding orthogonal
P kn . If a spectrum P kn is 0 or has only small values, it
provides warning of dependence between spectra that could cause
problems in coefficient estimation. In such case, the
information is provided to the operator, or separate decision
circuitry, to determine whether to delete the spectrum from the
model, to downweight its importance, or to accept it without
change. Orthogonalization processing may be performed solely
for the purpose of generating this warning information. When
full orthogonalization is chosen, the reference spectra input
to matrix multiplier 340 are the P kj and P ka .
The orthogonal component generator and storage 360 comprises
storage for P kn , the portion that is filled as the process
proceeds comprising Z, storage for [Z'Z]^-1, storage for the
intermediate product Z[Z'Z]^-1, storage for Z i , storage for
the intermediate product Z'Z i , point by point multiply and sum
logic, scalar inversion (1/a) logic, a subtractor, and the
sequencer to select data from storage for processing, to
control the processing sequence, and to direct storage of
results. Circuit devices to perform these functions include the
Intel 80287 math coprocessor for hardware implementation of the
arithmetic functions, CMOS static RAM chips (e.g. 4 parallel
Motorola MCM6226-30 128Kx8) to provide 32 bit resolution in
storage of the digital data, and standard programmable array
logic devices (PALs) combined with a clock and counter as the
sequencer. Each matrix element is acted on in sequence in
accordance with the hardware logic. The required functions can
also be obtained with a standard array processor operated in
sequential fashion by the sequencer.

Operation is as follows after clearing to 0's:

1. Set n and the first column of the P kn storage to 1's.
(Z = P k1 )

2. Set the first element of [Z'Z]^-1 = 1/K

3. Set all elements of Z[Z'Z]^-1 = 1/K

4. Increment n

5. Move spectrum R kn to Z i storage (R ks for n = 2) (K x 1)

6. Multiply and sum to form Z'Z i (n-1 x 1)

7. Multiply and sum to form an element of Z[Z'Z]^-1 Z'Z i
(K x n-1)

8. Subtract sum from the same element of Z i

9. Store in that element of column n of P kn storage (K x n)

10. Repeat steps 7-9 for the K points in spectrum R kn to get
P kn (Z = P k1 .. P kn )

11. Multiply and sum to form Z i 'Z i (scalar)

12. Invert (1/a) and store in nth element of [Z'Z]^-1 (n x n)

13. Multiply and sum to form new elements of Z[Z'Z]^-1 (K x n)

14. Repeat from step 4 until all R kn are used.
(Z = P k1 .. P kN ) (K x N)

15. End

The contents of Z[Z'Z]^-1 is the transpose U' of
U = [Z'Z]^-1 Z', which is useful in finding multiple linear
regression coefficients by matrix multiplication.
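
The sequence of operations above can be sketched in software as follows. This Python sketch follows the listed steps under stated assumptions: it keeps only the diagonal of [Z'Z]^-1 (valid because the stored columns are orthogonal), and it handles a near-zero transformed spectrum by dropping it and reporting its index, which is only one of the options the text leaves to the operator. The function name `gram_schmidt`, the relative tolerance, and the sample spectra are illustrative assumptions.

```python
def gram_schmidt(references, rel_tol=1e-12):
    """Orthogonalize reference spectra against a column of ones and each other.

    references: list of spectra (each a list of K values), R_ks first.
    Returns the orthogonal columns P, the diagonal of [Z'Z]^-1, and the
    indices of spectra flagged as dependent on earlier ones.
    """
    k = len(references[0])
    p_cols = [[1.0] * k]          # step 1: first column of P is all ones
    inv_diag = [1.0 / k]          # step 2: first element of [Z'Z]^-1 is 1/K
    dependent = []
    for index, r_kn in enumerate(references):        # steps 4 and 14
        z_i = [float(v) for v in r_kn]               # step 5
        # Step 6: Z'Z_i, one dot product per column already stored.
        proj = [sum(p * v for p, v in zip(col, z_i)) for col in p_cols]
        # Steps 7-10: subtract Z [Z'Z]^-1 Z' Z_i element by element.
        new_col = [
            z_i[row] - sum(col[row] * inv * pr
                           for col, inv, pr in zip(p_cols, inv_diag, proj))
            for row in range(k)
        ]
        norm_sq = sum(v * v for v in new_col)        # step 11
        if norm_sq <= rel_tol * sum(v * v for v in z_i):
            dependent.append(index)                  # warning of dependence
            continue
        p_cols.append(new_col)                       # step 9
        inv_diag.append(1.0 / norm_sq)               # step 12
    return p_cols, inv_diag, dependent


# Hypothetical spectra; the third is an exact linear function of the first.
r_ks = [1.0, 2.0, 3.0, 4.0]
r_ka = [1.0, 0.0, 1.0, 0.0]
r_dep = [2.0 * v + 1.0 for v in r_ks]
p_cols, inv_diag, dependent = gram_schmidt([r_ks, r_ka, r_dep])
```

The first transformed spectrum is R ks minus its average, as stated in the text, and the dependent third spectrum is reported rather than stored.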

Full orthogonalization may modify the spectra 
so drastically that it becomes difficult to recognize 
their origin and the associated coefficients are thor- 
oughly aliased compared to the original quantities 
represented by the reference spectra. This is par- 
ticularly troublesome when interference subtraction, 
interference downweighting, or analyte enhancement is desired.
These factors often make it desirable to perform less drastic
processing.



An alternative preferred embodiment orthogonalizes each
interferant spectrum only against the analyte spectrum by the
simple linear regression model

$$R_{kn} = b_{on} + R_{ka} b_{an} + e_{kn}, \qquad P_{kn} = R_{kn} - b_{on} - b_{an} R_{ka}$$

With this procedure, R ka may be omitted in the estimation of
the b ji coefficients without causing errors in their
determination. This method has the advantage of only removing
analyte related information from the reference spectra, thus
the analyte spectrum is unaffected, the interferant spectral
shapes are minimally affected, and the coefficients have
physical interpretations. In this case, the correct input to
the matrix multiplier 340 is R kj rather than P kj , to properly
subtract the portion of R kj correlated with R ka .
Implementation of this digital logic requires only a subset of
the functions described previously.
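
A minimal Python sketch of this partial orthogonalization follows, assuming a single analyte spectrum and ordinary least squares for the simple regression. The function name `orthogonalize_against_analyte` and the sample spectra are hypothetical.

```python
def orthogonalize_against_analyte(r_kn, r_ka):
    """Remove from an interferant spectrum the part explained by the analyte.

    Fits R_kn = b_on + R_ka * b_an + e_kn by least squares and returns the
    residual spectrum P_kn together with the two regression coefficients.
    """
    k = len(r_kn)
    mean_a = sum(r_ka) / k
    mean_n = sum(r_kn) / k
    s_aa = sum((a - mean_a) ** 2 for a in r_ka)
    s_an = sum((a - mean_a) * (v - mean_n) for a, v in zip(r_ka, r_kn))
    b_an = s_an / s_aa
    b_on = mean_n - b_an * mean_a
    p_kn = [v - b_on - b_an * a for a, v in zip(r_ka, r_kn)]
    return p_kn, b_on, b_an


# Hypothetical analyte spectrum and an interferant partly correlated with it.
r_ka = [0.0, 1.0, 2.0, 1.0, 0.0]
r_kj = [0.5, 0.9, 1.0, 0.7, 0.6]
p_kj, b_oj, b_aj = orthogonalize_against_analyte(r_kj, r_ka)
```

The residual spectrum has zero mean and zero inner product with the analyte spectrum, so the analyte coefficient no longer influences the interferant coefficient.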

If even this degree of reference spectrum modification is
undesirable, the orthogonal component generation is bypassed
and the original reference spectra are passed to the
coefficient estimator.

The coefficient estimator 320 mathematically determines the
coefficients applicable to the various components used to model
the input data. In general, the coefficient estimation process
involves creating a model representative of the input spectral
data that is a function of the reference spectral data and, in
nonlinear models, of other variables such as the input spectral
data itself and k.

In the linear case, taking R kn = [1, R ks , R ka , R kj ], a
matrix where each row represents observations at a value k of
the spectral variable and each column is a reference spectrum
R kn incorporated in the model, coefficient estimator 320 fits
X ki to R kn by some method, minimizing the residuals e ki in

$$X_{ki} = b_{oi} + R_{ks} b_{si} + R_{ka} b_{ai} + R_{kj} b_{ji} + e_{ki}$$

Methods for achieving this linear modeling include generalized
least squares, maximum likelihood regression, robust
regression, estimated best linear predictor, partial least
squares, principal component regression, Fourier regression,
covariance adjustment, and others. For example, generalized
least squares with generalized inverse models X ki by

$$X_{ki} = R_{kn} b_{ni} + e_{ki}$$

where b ni is [b oi , b si , b ai , b ji ] calculated by

$$b_{ni} = [R' V(i)^{-1} R]^{-} R' V(i)^{-1} X_{ki}$$

where [ ]^- means a generalized inverse and where covariance
matrix V(i) can be iteratively updated based on the previous
fit for this specimen i.

However, a preferred embodiment uses the more usual linear
modeling performed by multiple linear regression where

$$b_{ni} = [R'R]^{-1} R' X_{ki}$$

When the specific R kn to be used are known in advance,

$$U_{nk} = [R'R]^{-1} R'$$

can be precomputed externally and stored with the reference
spectra, thereby minimizing the requirements on the data
normalizer 300. If full rank Gram-Schmidt orthogonalization is
used, U nk is available from that process. In either case the
calculation of the coefficients of the linear model involves a
simple matrix multiplication. A matrix multiplier for U nk and
X ki consists of short term storage for one or both inputs, a
multiplication and summation circuit, an address sequencer
which accesses the corresponding elements k of U nk and X ki ,
and a second address sequencer which accesses the rows n of
U nk and addresses the short term storage which keeps the
resulting b ni values. In the more general case of multiple
linear regression, matrix [R'R] must be formed and inverted to
obtain [R'R]^-1 prior to matrix multiplication by R' to obtain
U nk . This function can readily be accomplished with an
available array processor and suitable logic sequencer.
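
The precompute-then-multiply arrangement described above can be sketched in Python. This is an illustration under stated assumptions: the matrix inverse is formed by Gauss-Jordan elimination in plain Python rather than by an array processor, the reference matrix has full column rank, and the names `precompute_u` and `estimate_coefficients` as well as the sample spectra are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("[R'R] is singular: reference spectra are dependent")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def precompute_u(r_kn):
    """Form U = [R'R]^-1 R' once, for a K x N reference matrix R."""
    r_t = [list(col) for col in zip(*r_kn)]
    return matmul(invert(matmul(r_t, r_kn)), r_t)


def estimate_coefficients(u_nk, x_ki):
    """Per-spectrum step: b_ni = U X_ki, a single matrix-vector product."""
    return [sum(u * x for u, x in zip(row, x_ki)) for row in u_nk]


# Hypothetical reference spectra; columns of R are [1, R_ks, R_ka, R_kj].
r_ks = [0.1, 0.4, 0.9, 0.3, 0.2, 0.6]
r_ka = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
r_kj = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
r_kn = [[1.0, s, a, j] for s, a, j in zip(r_ks, r_ka, r_kj)]
u_nk = precompute_u(r_kn)

x_ki = [0.2 + 1.5 * s + 0.3 * a - 0.4 * j for s, a, j in zip(r_ks, r_ka, r_kj)]
b_ni = estimate_coefficients(u_nk, x_ki)   # [b_oi, b_si, b_ai, b_ji]
```

Only `estimate_coefficients` needs to run for each new input spectrum, which mirrors the reduced requirement on the data normalizer noted in the text.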

A second preferred embodiment of the coeffi- 
cient estimator which avoids matrix inversion is a 
principal components regression (PCR) device, 
which requires no pretreatment of R kn and no matrix
inversion.

In the case of nonlinear modeling, the coefficient estimator
320 becomes more complex as each nonlinear coefficient becomes
a vector of length K. C ki and D ki are therefore matrices
containing a number of coefficient vectors that depends on the
form of the nonlinear model.

These coefficients cannot be determined by multiple linear
regression or other bilinear methods, so an iterative procedure
must be used. Methods in the literature include linearization
by Taylor series, steepest descent, Marquardt's compromise, and
simplex optimization (N. Draper and H. Smith, Applied
Regression Analysis, Second Edition, John Wiley & Sons, New
York 1981, pp. 458-465).

A preferred embodiment uses the coefficient estimator 320
illustrated in Figure 4 which employs Taylor series
linearization. The model response generator 321 calculates the
vector F from the reference spectra Rkn, the present value Ari
of the coefficients being generated by the iterative process,
the variable k and the input spectral data Xki. This operation
involves matrix multiplication and summation in accordance with
the appropriate form of model as discussed above. The set of
coefficients Ari, comprising cni of Cki, dni of Dki, and bni,
are initially stored in coefficient Aq-1 storage 322a. They may
be modified by means of adder 322b through addition of a
weighted correction wGq or of increment dAr to one of the
coefficients at a time to create the present values stored in
coefficient Aq storage 322c and used by model response
generator 321. The remaining functions will become obvious from
the following description of the operation.

1. Initialize A(q-1), k, F(Aq), and iteration counter 
q to 0 

2. Load Rkn into the partial differences Zr stor- 
age 323b 

3. Regress Xki on Rkn to determine the linear 
model bni 

4. Set w = 1 and add bni to A(q-1) to put bni in 
Aq storage 322c 

5. Set d0i = 1 in Aq and generate F(A0), store
in 323c

6. Transfer Aq from 322c to A(q-1) storage 322a 

7. Sequentially increment Aq values by adding 
dAr by 322b 

8. Calculate F(Aq+dAr) and subtract F(Aq) to 
get Zr 

9. Store in Zr and iterate 7,8,9 for all r 

10. Form X-F(Aq) and regress on Zr using 324 
to form Gq 

11. Compute SSq, compare to prior value, and 
select weight by 326 

12. Add weighted Gq to A(q-1) to form next
Aq

13. Compare Gq/Aq with stop criterion, if greater 
next q, else end 

This operation is controlled by sequencer 327. 
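
The iterative operation listed above can be sketched in Python as a Taylor series (Gauss-Newton) loop with finite-difference partial derivatives. The sketch makes several assumptions that are not taken from the text: the model is restricted to C = 0, D_k = 1 + d_1*k, and a single reference spectrum R ks; the weight w is chosen by simple step halving; and the stop criterion is a relative tolerance. All names and numeric values are hypothetical.

```python
def solve(a, b):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def least_squares(cols, y):
    """Regress y on the given columns through the normal equations."""
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(gram, rhs)


def model(a, k, r_ks):
    """F(A) for the assumed model (1 + d_1*k) * (b_o + b_s*R_ks)."""
    b_o, b_s, d_1 = a
    return [(1.0 + d_1 * kk) * (b_o + b_s * r) for kk, r in zip(k, r_ks)]


def sum_sq(x, f):
    return sum((u - v) ** 2 for u, v in zip(x, f))


def fit_nonlinear(x, k, r_ks, d_a=1e-6, tol=1e-10, max_iter=50):
    # Steps 3-5: start from the linear fit with d_0 fixed at 1 and d_1 = 0.
    b_o, b_s = least_squares([[1.0] * len(x), list(r_ks)], x)
    a = [b_o, b_s, 0.0]
    for _ in range(max_iter):
        f = model(a, k, r_ks)
        # Steps 7-9: partial differences from incrementing one coefficient at a time.
        z = []
        for r in range(len(a)):
            a_inc = list(a)
            a_inc[r] += d_a
            z.append([(fi - f0) / d_a for fi, f0 in zip(model(a_inc, k, r_ks), f)])
        # Step 10: regress X - F(Aq) on Zr to obtain the correction Gq.
        g = least_squares(z, [u - v for u, v in zip(x, f)])
        # Steps 11-12: halve the weight until the sum of squares does not increase.
        w, ss_old = 1.0, sum_sq(x, f)
        while w > 1e-6:
            trial = [ai + w * gi for ai, gi in zip(a, g)]
            if sum_sq(x, model(trial, k, r_ks)) <= ss_old:
                break
            w *= 0.5
        a = trial
        # Step 13: stop when the weighted correction is small relative to Aq.
        if max(abs(w * gi) for gi in g) <= tol * max(1.0, max(abs(ai) for ai in a)):
            break
    return a


# Hypothetical data generated from known coefficients [b_o, b_s, d_1].
k = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
r_ks = [0.10, 0.40, 0.90, 0.30, 0.20, 0.60]
true_a = [0.1, 1.2, 0.1]
x_ki = model(true_a, k, r_ks)
fitted = fit_nonlinear(x_ki, k, r_ks)
```

A fuller model would add further terms of the C ki and D ki series as additional coefficients in the same loop.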

It should be obvious to those of ordinary skill in 
the art that some of the above operations may be 
performed in different order without significantly 
affecting the results obtained. It should also be 
obvious that the functions shown can be implemented
with common digital logic circuits well understood
by those of ordinary skill or by available microcode
controlled array processors, such as the Data
Translation Model DT7020 with the MACH DSP
Subroutine Library of microcode. Copies of the
applicable Data Translation 1988/99 Data Acquisition
Handbook pages have been annexed hereto
and incorporated in their entirety by reference.
One variation in the method used to estimate and 
correct for multiplicative effects is to fit an additive 
model and a multiplicative model in an interactive 
sequential fashion. 

1. Let X = (x ik ) be the matrix of spectral ordinates for
i = 1,2,...,N objects, k = 1,2,...,K wavelengths. The
multiplicative effect is modeled from the spectral data using a
standard multiplicative scatter correction (i.e., avoiding the
use of components for practicing the present invention)
yielding the corrected spectral data Z.

2. Estimate an additive model:

$$Z = \mathbf{1}\,z_{mean} + D P' + E$$

where 1 is a vector of ones of size k, z mean is the mean
vector of Z, and P = (p kl ) spans the spectral variations of
analytes and interferences as well as possible and/or to the
extent the user wishes. P may include any of the following:
input component spectra, estimated component spectra, loadings
from a PCA or PLS analysis of residuals after fitting estimated
component spectra, or loadings from PCA or PLS analysis of X or
Z. D represents the associated vector of weights obtained by
PCA or PLS, for example.

3. Reconstruct the spectral data without the estimated additive
effects,

$$Y = X - D P'$$

4. Estimate the multiplicative effects on Y using one of the
methods proposed in this invention.

5. Construct a new matrix of corrected spectra Z from X and
repeat step 2, step 3, and step 4 until convergence occurs.

6. Following performance of steps 1-5, the finally corrected
spectra, Z, may be used as multiplicatively corrected input
spectra, or the desired D and P factors which one wants to take
away may be subtracted from these finally corrected spectra, Z.

In this method, the multiplicative effects, say
from a physical model, and the additive effects,
say from a chemical model, are obtained at separate
steps in the process. However, the results of
each model are adjusted for the effect of the other
model. That is, the results are adjusted for the
multiplicative effects and the additive and interferent
effects present in the bilinear factors which
are chosen for elimination. In addition, this technique
allows for a wide variety of choices of kinds
of components to include in the chemical model,
varying from known interferents and component
spectra through statistically estimated PCA or PLS
factors.
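
The sequential procedure of steps 1-5 can be sketched as follows. This is
only an illustrative sketch under stated assumptions, not the apparatus of
the invention: the function names msc_fit and iterative_correction, the
use of the mean spectrum as reference, the use of PCA (via SVD) for the
additive model, the convergence test on the slopes b, and the reuse of an
ordinary MSC fit in step 4 (in place of "one of the methods proposed in
this invention") are all assumptions made for illustration.

```python
import numpy as np

def msc_fit(X, ref):
    """Least-squares offset a_i and slope b_i of each row of X regressed on ref."""
    A = np.column_stack([np.ones_like(ref), ref])    # K x 2 design matrix
    coef, *_ = np.linalg.lstsq(A, X.T, rcond=None)   # 2 x N coefficients
    return coef[0], coef[1]

def iterative_correction(X, n_factors=1, max_iter=50, tol=1e-8):
    """Hypothetical sketch of the sequential additive/multiplicative fit."""
    # Step 1: standard multiplicative scatter correction against the mean spectrum.
    a, b = msc_fit(X, X.mean(axis=0))
    Z = (X - a[:, None]) / b[:, None]
    for _ in range(max_iter):
        # Step 2: additive model Z = 1*z_mean' + D*P' + E, here estimated by PCA.
        z_mean = Z.mean(axis=0)
        U, s, Vt = np.linalg.svd(Z - z_mean, full_matrices=False)
        D = U[:, :n_factors] * s[:n_factors]         # weights (scores)
        P = Vt[:n_factors].T                         # K x n_factors loadings
        # Step 3: reconstruct the data without the estimated additive effects.
        Y = X - D @ P.T
        # Step 4: estimate multiplicative effects on Y (assumed: plain MSC fit).
        a_new, b_new = msc_fit(Y, Y.mean(axis=0))
        # Step 5: build the corrected spectra Z from X and check convergence.
        Z = (X - a_new[:, None]) / b_new[:, None]
        converged = np.max(np.abs(b_new - b)) < tol
        a, b = a_new, b_new
        if converged:
            break
    return Z, a, b
```

Convergence is not guaranteed in general, so the sketch also stops after
max_iter passes.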

The fundamental improved method of data normalization
provided by this invention is based on
the use of previously obtained analyte and reference
spectra to model multiplicative effects on
spectral data, although use of the invention does
not specifically require the estimation of multiplicative
effects directly from the input spectral data
using said reference spectra. Rather, the multiplicative
effects can be modeled from coefficients
and/or loadings derived from statistical analyses
(e.g. multiple linear regression, principal component
analysis, partial least squares, and generalized
least squares) of spectral data. The multiplicative
effects obtained in this way can be used to correct
the spectral data for multiplicative effects.

For example, if the physical situation results in
a combined additive and multiplicative structure,
the measured spectral information may be considered as

X = T*P' + E

where X = (x_ik) is the matrix of spectral ordinates for
i = 1,2,...,N objects, k = 1,2,...,K wavelengths, T = (t_il)
is the matrix of scores for objects i, bilinear factors
l = 1,2,...,L obtained from some bilinear model (e.g.
principal component analysis, partial least squares,
etc.), P = (p_kl) are the loadings for objects i on
bilinear factors l, and E = (e_ik) are the residuals
between data X and model T*P'. The loadings P
can then be decomposed into a function of a
reference spectrum r = (r_k) (e.g. the mean of the X
data) and a matrix G = (g_km) spanning the spectra
for analyte and interference phenomena
m = 1,2,...,M:

P' = d*r' + h*1' + C*G' + F

where d = (d_l) and h = (h_l) are vectors of length L,
1 is a vector of ones of length K, C = (c_lm) is a
matrix of regression coefficients of size LxM which
quantifies the analyte and interference contributions,
and F = (f_lk) contains the residual loadings
with the multiplicative, analyte, and interference
phenomena removed. d, h, and C can be estimated
by regression of P' on r, 1, and G by some method
(e.g. weighted least squares). C*G' could be reduced
in size by elimination of effects if the relative
size of the chemical or interferent effects is small.
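
The decomposition of the loadings can be sketched as an ordinary
regression. This is a minimal illustration only: the function name
decompose_loadings is hypothetical, and unweighted least squares is used
as a stand-in for the weighted least squares mentioned in the text.

```python
import numpy as np

def decompose_loadings(P, r, G):
    """Regress P' on r, 1 and G (assumed: ordinary least squares).

    P : K x L loadings, r : length-K reference spectrum,
    G : K x M analyte/interferent spectra.
    Returns d (L), h (L), C (L x M) and residual loadings F (L x K).
    """
    K = P.shape[0]
    A = np.column_stack([r, np.ones(K), G])        # K x (2 + M) regressors
    coef, *_ = np.linalg.lstsq(A, P, rcond=None)   # (2 + M) x L coefficients
    d, h, C = coef[0], coef[1], coef[2:].T
    # F = P' - d*r' - h*1' - C*G'
    F = P.T - (np.outer(d, r) + np.outer(h, np.ones(K)) + C @ G.T)
    return d, h, C, F
```

Each column of coef holds the coefficients for one bilinear factor, so d
and h come out as vectors of length L and C as an L x M matrix, matching
the sizes stated above.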

The additive and multiplicative effects for the
input spectra can be obtained from the loadings
and scores by

a = T*h and
b = T*d

If the mean values of vectors a and b are a_mean and
b_mean, respectively, the input spectra, corrected for
additive and multiplicative effects, can be determined by

Y_i = [X_i + a_mean - a_i] * b_mean/b_i

The quantities (a_mean - a_i) and (b_mean/b_i) appear in the
equation to scale the corrected spectra such that
the individual spectrum's additive and multiplicative
corrections are made relative to the overall additive
and multiplicative effects.

In addition, the input spectral data can be corrected
simultaneously for interferent contributions
and additive and multiplicative effects,

Y_i = [X_i - T_i * C* * G*' - a_i + a_mean] * b_mean/b_i

C* and G* are user-chosen subsets of C and G
which include those interferents and analytes of
interest which it is desired to eliminate from the
input spectral data. The corrected spectrum Y_i
represents the original input spectrum after correction
for the additive, multiplicative, and interferent
effects present in the bilinear factors.
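
As an illustration, the correction of the input spectra could be computed
as below. The function name correct_spectra and its argument names are
hypothetical; C_sub and G_sub stand for the user-chosen subsets C* and G*,
and d and h are assumed to have been estimated already from the loadings.

```python
import numpy as np

def correct_spectra(X, T, d, h, C_sub=None, G_sub=None):
    """Sketch of Y_i = [X_i - T_i*C**G*' - a_i + a_mean] * b_mean/b_i.

    X : N x K spectra, T : N x L scores, d, h : length-L coefficients,
    C_sub : L x M* and G_sub : K x M* (optional interferent subsets).
    """
    a = T @ h                                 # additive effect per object
    b = T @ d                                 # multiplicative effect per object
    Y = np.asarray(X, dtype=float)
    if C_sub is not None and G_sub is not None:
        Y = Y - T @ C_sub @ G_sub.T           # remove chosen interferent terms
    Y = (Y + a.mean() - a[:, None]) * (b.mean() / b)[:, None]
    return Y, a, b
```

Calling correct_spectra(X, T, d, h) without the subset arguments gives the
correction for additive and multiplicative effects only. The sketch
assumes every b_i is non-zero.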

A modification to the above technique includes 
the method whereby the additive, multiplicative and 
interferent effects are modeled from the coeffi- 
cients and/or loadings of multivariate statistical 
techniques and the corrections are applied directly 
to the multivariate scores rather than to the input 
spectral data. 

Using the prior example where the input spectral
data X are modeled using a bilinear model,

X = T*P' + E

the offset-corrected and interferent-corrected spectral
data can be defined as Z = (z_ik) where

Z = X - T*h*1' - T * C* * G*' = T*(P' - h*1' - C* * G*') + E

where T, h, 1, P, C*, and G* are defined as above.
Z represents a general case. More specifically, Z
can be corrected for offset and/or a subset of the
analyte and interference information contained in C.
In practice, if additive correction is desired, the
offset correction and a correction for only a subset
of C and G will be used. The offset-corrected input
spectra may be considered as

Z = T*L + E

where L = P' - h*1' - C* * G*'. Use of singular value
decomposition (PCA) can partition the uncentered
L into two components,

L = U*V'

where U is a matrix of eigenvalues and V' is a
matrix of eigenvectors. By substitution,

Z = T*U*V' + E

The product T*U produces offset and interferent
corrected scores and V' is the matrix of corresponding
spectra loadings associated with the corrected scores.

Multiplicative correction of the offset and interferent
corrected data Z can be found in the following way:

Let S be the diagonal matrix containing the elements
of the product T*d. The fully corrected spectral
data are found by

S^-1*Z = S^-1*T*U*V' + S^-1*E

where S^-1 is the inverse of S. The fully corrected
score matrix W is found in a similar fashion,

W = S^-1*T*U

W = (w_il) is the matrix of the offset, multiplicative,
and interferent corrected scores which can be used
as regressors in additive mixture models etc.

It is also possible to obtain a set of scores
which are corrected only for multiplicative effects
by following the same method,

W = S^-1*T
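
A minimal sketch of the score correction follows. The function name
corrected_scores is hypothetical, and the singular values are folded into
U so that U*V' reproduces the corrected loadings; that split is one
possible convention chosen for illustration, not necessarily the one
intended in the text.

```python
import numpy as np

def corrected_scores(T, P, d, h, C_sub, G_sub):
    """Sketch of W = S^-1 * T * U for offset, interferent and
    multiplicative correction of the scores.

    T : N x L scores, P : K x L loadings, d, h : length-L coefficients,
    C_sub : L x M*, G_sub : K x M* (user-chosen subsets C* and G*).
    """
    K = P.shape[0]
    # Corrected loadings: P' - h*1' - C**G*'
    Lmat = P.T - np.outer(h, np.ones(K)) - C_sub @ G_sub.T
    Us, s, Vt = np.linalg.svd(Lmat, full_matrices=False)
    U = Us * s                      # assumed convention: U carries the singular values
    b = T @ d                       # diagonal elements of S
    W = (T @ U) / b[:, None]        # S^-1 * T * U, without forming S explicitly
    return W, Vt
```

Dividing each row by b_i is equivalent to multiplying by the inverse of
the diagonal matrix S. Scores corrected only for multiplicative effects
would be T / b[:, None].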

The above methods may be used for calibration, 
prediction, and determination procedures. Using ei- 
ther of the above two techniques, calibration occurs 
in the following way: 

1. Applying a bilinear model to a set of spectral
data in a calibration data set, decompose the
spectral data into the factor scores T and the
factor loadings P;

2. Using a statistical method (e.g. weighted least
squares), a reference spectrum r, and appropriate
analytes and interferents G, calculate d, h, and
C from P;

3. If the spectral scores are to be corrected,
calculate U (for additive and interferent effects)
and S^-1 (for multiplicative effects);

4. Correct the spectral input data after calculating
b_i, a_i, a_mean, and b_mean,

Y_i = [X_i - T_i * C* * G*' - a_i + a_mean] * b_mean/b_i ;

or correct the spectral scores,

W = S^-1*T*U ;

5. Use the corrected spectral data Y to fit a linear
model. Methods for achieving this model include
multiple linear regression, generalized least
squares, maximum likelihood regression, robust
regression, estimated best linear predictor, partial
least squares, principal component regression,
Fourier regression, and other techniques.
Alternately, use the corrected spectral scores
W to fit a linear model using an appropriate method
listed above.

Prediction occurs in the following way: 
1. Use an independent set of data and apply the
factor loadings P to find a new set of spectral
scores T;

2. To use corrected spectral data, calculate a
and b from the new spectral scores and use
a_mean, b_mean, C and G derived from the calibration
to determine

Y_i = [X_i - T_i * C* * G*' + a_mean - a_i] * b_mean/b_i ;

Alternatively, to use corrected spectral scores,
calculate a new S^-1 from the new spectral
scores and use U from the calibration data to
find

W = S^-1*T*U ;

3. Use the corrected data and the calibration
model coefficients from the linear model to predict
the properties of interest.
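
Prediction steps 1 and 2 for corrected spectral data might look like the
following sketch. The function name predict_corrected is hypothetical, and
finding the new scores by least-squares projection onto the calibration
loadings is an assumption; the text does not state how the loadings are
applied.

```python
import numpy as np

def predict_corrected(X_new, P, d, h, a_mean, b_mean, C_sub, G_sub):
    """Correct new spectra using quantities carried over from calibration.

    X_new : N x K new spectra, P : K x L calibration loadings,
    d, h, C_sub, G_sub, a_mean, b_mean : values from the calibration step.
    """
    # Step 1: new scores, assumed here to be the least-squares fit of each
    # new spectrum to the calibration loadings.
    T_new = np.linalg.lstsq(P, X_new.T, rcond=None)[0].T      # N x L
    # Step 2: additive and multiplicative effects from the new scores.
    a = T_new @ h
    b = T_new @ d
    Y = (X_new - T_new @ C_sub @ G_sub.T + a_mean - a[:, None]) \
        * (b_mean / b)[:, None]
    return Y, T_new
```

Step 3 would then apply the calibration model coefficients to the rows of
Y, for example as Y @ beta for a hypothetical coefficient vector beta.
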
In the description of the alternative embodiments
described immediately above (pp. 35-40),
the apparatus described generally in Figure 2 still
is applicable as would be understood by one of
ordinary skill. In construction of some of the more
detailed blocks, the coefficient estimator 320 described
above is preferably the basic element. For
example, the estimation of an additive model (p.
35, step 2), is performed by coefficient estimator
320. The reconstruction of the spectral data (p. 35,
step 3), is preferably performed by calculator 330.
The iteration required on page 35, step 5 is controlled
by a logic sequencer 370 or equivalent.
Modeling from statistical analyses (principal component
analysis or partial least squares, for example)
may be accomplished by the structure shown
in Figure 5. Decomposition of loadings (see page
37) may be performed by the coefficient estimator
320. Other functions are readily performed by apparatus
disclosed herein.



Claims

1. A method for correcting input spectral data (X_ik)
derived from a measurement, particularly as to
multiplicative errors, said method comprising the
steps of:

providing a first and primary reference spectrum
(P_ok) representing a predetermined standard for
such data;

providing at least one second reference spectrum
(P_ak or P_jk);

estimating coefficients for a selected appropriate
model to be applied to said input data based on
said first and second reference spectra; and

correcting said spectral data based on said estimated
coefficients at least as to multiplicative
errors for producing a linear additive structure for
use in calibration, validation and determination by
linear multivariate analysis.

2. A method as in claim 1 wherein said at least one 
second reference spectrum represents the expect- 
ed influence of the analytes of interest on this input 
data (P_ak).

3. A method as in claim 1 wherein said at least one 
second reference spectrum represents the expect- 
ed influence of various undesired interferences 
(P_jk).

4. A method as in claim 1 wherein said at least one 
second reference spectrum represents the expect- 
ed influence of the analytes of interest on the input 
data (P_ak) and said method also includes the step
of providing at least one third reference spectrum
(P_jk) representing the expected influence of various
undesired interferences, said estimating step being 
also based on said at least one third reference 
spectrum. 

5. A method as in claim 1 or claim 2 or claim 3 or 
claim 4 also including the step of correcting said 
spectral data as to additive (offset) errors based on 
said estimated coefficients. 

6. A method as in claim 1 wherein said model is a 
linear model. 

7. A method of claim 5 wherein said model em- 
ploys a generalized least squares technique. 

8. A method of claim 6 wherein said model em- 
ploys a maximum likelihood regression technique. 

9. A method of claim 6 wherein said model em- 
ploys an estimated best linear predictor technique. 

10. A method of claim 6 wherein said model em- 
ploys a principal component regression technique. 

11. A method of claim 6 wherein said model em- 
ploys a covariance adjustment technique. 

12. A method of claim 1 wherein said model is a 
non-linear model. 

13. A method of claim 12 wherein said model 
employs a Taylor expansion technique. 

14. A method of claim 12 wherein said model 
employs a steepest descent method technique. 

15. A method of claim 12 wherein said model 
employs a Marquardt's compromise technique. 

16. A method of claim 12 wherein said model 
employs a simplex optimization technique. 

17. A method as in claim 1 or claim 3 or claim 4
including the steps of using the coefficients of at
least one of the interfering components derived by
the modeling to scale the spectra of these components
and subtracting the scaled spectra from the
data to substantially remove their contribution from
the data.

18. A method as in claim 17 including the steps of
generating modified reference spectra of the interfering
components that contain only those portions
of the original reference spectra of the interfering
components that are orthogonal to, and therefore
uncorrelated with, at least one reference analyte
spectrum, using the coefficients generated from
such orthogonal reference spectrum to scale the
original reference spectra prior to subtracting the
scaled spectra from the data.

19. A method as in claim 18 including the step of
further scaling the spectra of the interfering components
to control the degree of spectral modification
and correction applied to the data.

20. A method as in claim 17 including the step of 
further scaling the spectra of the analyte data. 

21. A method as in claim 1 or claim 2 or claim 3 or
claim 4 or claim 17 also including the step of
updating of the standard P_kb, the analyte P_ka,
and/or interference P_kj spectra based on the results
of later stages of data processing and analysis.

22. A method as in claim 21 where said later stage
of data processing analysis includes the technique
of principal components analysis (PCA).

23. A method as in claim 21 where said later stage 
of data processing analysis includes the technique 
of partial least squares (PLS). 

24. A method as in claim 1 or claim 2 or claim 3 or
claim 4 or claim 17 including the step of interactively
displaying graphical output concerning
which analytes and interference spectra, if any, are
causing difficulties with respect to estimation of the
multiplicative correction and interactive control over
which reference spectra are utilized, the spectral
range included in estimating the coefficients and
the weighting of the additive corrections employed.

25. A method as in claim 24 including the step of
further scaling the spectra of the interfering components
to control the degree of spectral modification
and correction applied to the data.

26. A method as in claim 24 also including the step 
of further scaling the spectra of the analyte data. 

27. Apparatus for correcting input spectral data
(X_ik) derived from a measurement, particularly as to
multiplicative errors, said apparatus comprising:

input means for supplying a signal representing
spectral data subject to correction for at least
multiplicative errors;

means for supplying a signal representing a first
and primary reference spectrum (P_ok) as a predetermined
standard for such data;

means for supplying a signal representing at least
one second reference spectrum (P_ka or P_kj);

means for estimating coefficients for a selected
model for application to said input spectral data,
said input spectral and first and second reference
spectral signals being supplied to said estimating
means; and

means responsive to said estimating means for
correcting said spectral data based on estimated
coefficients at least as to multiplicative errors for
producing a signal representing a linear additive
structure for use in calibration, validation and determination
by linear multivariate analysis.

28. Apparatus as in claim 27 wherein said means 
for supplying said signal representing at least one 
second reference spectrum supplies a signal which 
represents at least one spectrum representing the 
expected influence of the analytes of interest of the 
input data (P_ak).

29. Apparatus as in claim 27 wherein said means 
for supplying said signal representing at least one 
second reference spectrum supplies a signal which 
represents at least one spectrum representing the
expected influence of various undesired interferences
(P_jk).

30. Apparatus as in claim 27 wherein said means 
for supplying said signal representing at least one 
second reference spectrum supplies a signal which 
represents at least one spectrum representing the 
expected influence of the analytes of interest of the 
input data and also including means for supplying 
a signal of at least one third reference spectrum 
representing the expected influence of various un- 
desired interferences. 

31. Apparatus as in claim 27 or claim 28 or claim
29 or claim 30 including means responsive to said 
estimating means for correcting said spectral data 
as to additive (offset) errors based on said es- 
timated coefficients. 

32. Apparatus as in claim 27 or claim 28 or claim 
29 or claim 30 wherein said estimating means 
estimates coefficients for a linear model. 

33. Apparatus as in claim 27 or claim 28 or claim 
29 or claim 30 wherein said estimating means 
estimates coefficients for a non-linear model. 

34. In a system for analyzing a medium, said
system having a spectrophotometric sensor for
providing a signal representing input spectral data
(X_ik), the improvement comprising:

apparatus for correcting said spectral data for at
least multiplicative errors, said apparatus including:

means for supplying a signal representing a first
and primary reference spectrum (P_ok) as a predetermined
standard for said spectral data;

means for supplying a signal representing at least
one second reference spectrum (P_ka or P_kj);

means for estimating coefficients for a selected
model for application to said input spectral data,
said input spectral and first and second reference
spectral signals being supplied to said estimating
means; and

means responsive to said estimating means for
correcting said spectral data based on estimated
coefficients at least as to multiplicative errors for
producing a signal representing a linear additive
structure for use in calibration, validation and determination
based on linear multivariate analysis.

35. A method for correcting input spectral data (X_ik)
derived from a measurement, particularly as to
multiplicative errors, by the fitting of an additive
model and multiplicative model in a sequential
fashion, comprising the steps of:

1) obtaining a set of spectral data Z from the
original input spectral data corrected for multiplicative
effects by using a standard multiplicative
scatter correction technique;

2) estimating an additive model which takes into
account spectral variations of analytes and interferences;

3) reconstructing the spectral data Y without the
estimated additive effects;

4) estimating the multiplicative effects on Y;

5) constructing a new matrix of corrected spectra
Z from X; and

repeating steps 2, 3 and 4 until convergence
occurs.

36. The method of claim 35 wherein the additive
effects are obtained from a different model than the
multiplicative effects.

37. The method of claim 36 wherein one model is 
a physical model and the other model is a chemi- 
cal model. 

38. The method of claim 35 wherein step 4 includes
the steps of:

providing a first and primary reference spectrum;

providing at least one second reference spectrum;

estimating coefficients for a selected appropriate
model to be applied to the input data based on
said first and second reference spectra; and

correcting said spectral data based on said estimated
coefficients.

39. A method for correcting input spectral data (X_ik)
derived from a measurement or for correcting the
scores obtained by bilinear modeling of such data,
said method comprising the steps of:

examining the measured spectral data as a combined
additive and multiplicative structure such that

X = T*P' + E

where X = (x_ik) is the matrix of spectral ordinates for
i = 1,2,...,N objects, k = 1,2,...,K wavelengths, T = (t_il)
is the matrix of scores for objects i, l = 1,2,...,L representing
bilinear factors obtained from a bilinear
model, P = (p_kl) are the loadings for objects i on bilinear
factors l, and E = (e_ik) are the residuals between
data X and model T*P';

decomposing the loadings into a function of a
reference spectrum and, optionally, a matrix of spectral
components for analyte and interference phenomena;

obtaining the additive and multiplicative effects for
the input spectra from the coefficients from the
loading decomposition and scores; and

correcting said input spectra based on said obtained
additive and multiplicative effects.

40. The method of claim 39 including the step of
correcting said input spectral data simultaneously
for interferent contributions and additive and multiplicative
effects.

41. A method for correcting input spectral data (X_ik)
derived from a measurement or for correcting the
scores obtained by bilinear modeling of such data,
said method comprising the steps of:

examining the measured spectral data as a combined
additive and multiplicative structure such that

X = T*P' + E

where X = (x_ik) is the matrix of spectral ordinates for
i = 1,2,...,N objects, k = 1,2,...,K wavelengths, T = (t_il)
is the matrix of scores for objects i, l = 1,2,...,L representing
bilinear factors obtained from a bilinear
model, P = (p_kl) are the loadings for objects i on bilinear
factors l, and E = (e_ik) are the residuals between
data X and model T*P';

decomposing the loadings into a function of a
reference spectrum and, optionally, a matrix of
spectral components for analyte and interference
phenomena;

obtaining the additive effects for the scores T from
the offset corrected loadings and the scores T; and

applying the obtained additive effects to the multivariate
scores;

obtaining the multiplicative effect from the coefficient
d from the reference spectrum; and

obtaining scores corrected for additive and multiplicative
effects using the multiplicative effect, or
scores T and, optionally, the additive effect.

42. The method of claim 41 including the steps of
additionally correcting the input spectral data for
multiplicative effects.

43. The method of claim 35 or claim 39 or claim 41
including the further step of using said corrected
input spectral data or said corrected scores for
calibration, prediction and determination procedures.

44. The method of claim 43 wherein the prediction
procedure, using corrected spectral data, includes
the steps of:

using an independent set of data and applying
factor loadings P to find a new set of spectral
scores T; and

determining the additive and multiplicative effects
from the new spectral scores and using the mean
quantities of additive and multiplicative effects and
those portions of the interferents and analytes
which had been used in a prior calibration to obtain
corrected spectra; and

using the corrected data and the calibration model
coefficients from the linear model to predict
properties of interest.

45. The method of claim 43 wherein the prediction
procedure, using corrected spectral scores, includes
the steps of:

using an independent set of data and applying
factor loadings P to find a new set of spectral
scores T; and

determining an S^-1 factor related to a matrix containing
T*d for multiplicative effects from the new
spectral scores T and using a factor U from calibration
relating to additive and interferent effects to
find

W = S^-1*T*U; and

using the corrected data and the calibration model
coefficients from the linear model to predict properties
of interest.

46. Apparatus for correcting an input spectral data
(X_ik) signal derived from a measurement, particularly
as to multiplicative errors, by the fitting of an
additive model and multiplicative model in sequen- 
tial fashion, comprising: 

first means for obtaining a signal representing a set 
of spectral data Z from the original input spectral 
data signal corrected for multiplicative effect by 
using a standard multiplicative scatter correction 
technique; 

second means for providing a signal representing
an estimate of an additive model which takes into
account spectral variations of analytes and interferences;

third means for providing a signal representing the 
reconstructing of spectral data Y without the es- 
timated additive effects; 

fourth means for providing a signal representing an 
estimate of the multiplicative effects on Y; 
fifth means for providing a signal representing the 
construction of a new matrix of corrected spectra Z 
from X; 

means for respectively providing said signal from 
said fifth means to said second, third and fourth 
means; and 

means responsive to the output signals of said 
second, third and fourth means for determining 
when convergence occurs. 

47. Apparatus for correcting an input spectral data
(X_ik) signal derived from a measurement or for
correcting a signal representing the scores obtained
by bilinear modeling of such data, comprising:

means for providing a signal responsive to said
input signal representing the measured spectral
data as a combined additive and multiplicative
structure such that

X = T*P' + E

where X = (x_ik) is the matrix of spectral ordinates for
i = 1,2,...,N objects, k = 1,2,...,K wavelengths, T = (t_il)
is the matrix of scores for objects i, l = 1,2,...,L representing
bilinear factors obtained from a bilinear
model, P = (p_kl) are the loadings for objects i on bilinear
factors l, and E = (e_ik) are the residuals between
data X and model T*P';

means for providing a signal representing decomposed
loadings which have been decomposed into
a function of a reference spectrum and, optionally, a
matrix of spectral components for analyte and interference
phenomena;

means for providing a signal representing the additive
and multiplicative effects for the input spectra
obtained from the coefficients from the loading
decomposition and scores; and

means for providing a signal representing the correction
of said input spectra from said signal representing
the additive and multiplicative effects.

48. Apparatus for correcting an input spectral data
(X_ik) signal derived from a measurement or for
correcting a signal representing the scores obtained
by linear modeling of such data, comprising:

means for providing a signal responsive to said
input signal representing the measured spectral
data as a combined additive and multiplicative
structure such that

X = T*P' + E

where X = (x_ik) is the matrix of spectral ordinates for
i = 1,2,...,N objects, k = 1,2,...,K wavelengths, T = (t_il)
is the matrix of scores for objects i, l = 1,2,...,L representing
bilinear factors obtained from a bilinear
model, P = (p_kl) are the loadings for objects i on bilinear
factors l, and E = (e_ik) are the residuals between
data X and model T*P';

means for providing a signal representing decomposed
loadings which have been decomposed into
a function of a reference spectrum and, optionally, a
matrix of spectral components for analyte and interference
phenomena;

means for providing a signal representing the additive
effects for the scores T obtained from the
offset corrected loadings and the scores T;

means responsive to said signal representing the
additive effects for providing a signal representing
the application of the additive effects to the multivariate
scores;

means for providing a signal representing the multiplicative
effect from the coefficient d from the
reference spectrum; and

means for providing a signal representing scores
corrected for additive and multiplicative effects using
the multiplicative effect, or scores T and, optionally,
the additive effect.